select() takes a long time after an EAGAIN read()?

Entrope (entrope@users.sourceforge.net)
06 Nov 2001 22:35:31 -0500

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Carlo Wood: "Re: [ircu-development] Slow on high-MTU (local host) connections?"
Previous message: Carlo Wood: "Re: [ircu-development] Slow on high-MTU (local host) connections?"

Carlo Wood diagnosed a problem that GamesNET has been seeing where
traffic between two processes on the same machine takes a lot longer
using the default configuration than the same traffic between
different machines. It's apparently entangled in the path MTU -- the
slow transfers happen when the MTU is 16 KB, as in localhost, and does
not happen when the MTU is smaller. I don't know of any reason it
should do this, and Carlo also chalks it up to kernel weirdness.

Is this a kernel bug, or is it expected (just not by us)?

-- Entrope

-- From his email (originally sent to the old lkml address): --

On Mon, Nov 05, 2001 at 09:17:52PM -0500, Entrope wrote:
> GamesNET's seeing oddly long burst times when we connect srvx to ircu
> on the same machine (~1 minute). These go away when connecting to
> another machine (connected via 100baseT; ~5 seconds).

I have reproduced this: 74 seconds with MTU == 16436 and 1.8 seconds(!)
with MTU == 8000.

I investigated this tonight and although it seems to be a kernel bug.

What typically happens is this (here process 'ircu' is bursting data
over a socket to process 'srvx2', using `localhost'):

UTC(sec.microsec) [who] syscall <time spend in syscall>

1005101096.583475 [ircu ] send(6, "or the month: instead of /msg au"..., 1092, 0) = 1092 <0.000013>
1005101096.583551 [ircu ] poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN, revents=POLLIN},
{fd=4, events=POLLIN}, {fd=2, events=POLLIN}, {fd=1, events=POLLIN},
{fd=3, events=POLLIN}, {fd=0, events=POLLIN}], 7, 36000) = 1 <0.557834>
1005101096.583681 [srvx2] read(12, "otice from def, posted 08:38 PM,"..., 218556) = 16628 <0.000125>
1005101096.583867 [srvx2] read(12, "ure because you can\'t accidently"..., 201928) = 5432 <0.000041>
1005101096.583963 [srvx2] read(12, 0x401d089a, 196496) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
1005101096.585688 [srvx2] select(13, [12], [], NULL, {13, 0}) = 1 (in [12], left {12, 440000}) <0.556093>
1005101097.141491 [ircu ] time(NULL) = 1005101097 <0.000005>
1005101097.141550 [ircu ] recv(5, "zAA NOTICE IXT :[\2users\2] Notice"..., 8192, 0) = 7948 <0.000070>

1005101096.040382 [srvx2] read(12, "ed 08:38 PM, 08/05/2001:\r\n:Globa"..., 87316) = 1898 <0.000026>
1005101096.040455 [srvx2] read(12, 0x401ec25e, 85418) = -1 EAGAIN (Resource temporarily unavailable) <0.000012>
1005101096.053036 [srvx2] select(13, [12], [], NULL, {13, 0}) = 1 (in [12], left {12, 740000}) <0.265021>
1005101096.072435 [ircu ] time(NULL) = 1005101096 <0.000009>
1005101096.072503 [ircu ] send(5, "clan-dk.org Ak6f52 Ih2 :Juped Us"..., 488, 0) = 488 <0.000067>
1005101096.072628 [ircu ] send(5, "172 3 1005101095 JupeUser21 fake"..., 2048, 0) = 2048 <0.000023>
1005101096.072699 [ircu ] send(5, "SrQjW IiT :Juped User!\r\nI NICK J"..., 2048, 0) = 2048 <0.000021>
1005101096.072767 [ircu ] send(5, "005101095 JupeUser22 fake1.clan-"..., 2048, 0) = 2048 <0.000022>
1005101096.072836 [ircu ] send(5, "jC :Juped User!\r\nI NICK JupeUser"..., 2048, 0) = 2048 <0.000024>
1005101096.072909 [ircu ] send(5, "95 JupeUser22 fake1.clan-dk.org "..., 2048, 0) = 2048 <0.000023>
1005101096.072978 [ircu ] send(5, "ed User!\r\nI NICK JupeUser2290 3 "..., 2048, 0) = 2048 <0.000020>
1005101096.073045 [ircu ] send(5, "User23 fake1.clan-dk.org ATPpVB "..., 2048, 0) = 2048 <0.000024>
1005101096.073115 [ircu ] send(5, "!\r\nI NICK JupeUser2337 3 1005101"..., 2048, 0) = 1560 <0.000023>
1005101096.073190 [ircu ] poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN|POLLOUT, revents=POLLOUT},
{fd=4, events=POLLIN}, {fd=2, events=POLLIN}, {fd=1, events=POLLIN},
{fd=3, events=POLLIN}, {fd=0, events=POLLIN}], 7, 36000) = 1 <0.233812>
1005101096.307098 [ircu ] time(NULL) = 1005101096 <0.000005>
1005101096.307154 [ircu ] send(5, "d User!\r\nI NICK JupeUser2355 3 1"..., 488, 0) = 488 <0.000035>

So basically, the long delay inside poll/select happens when
svrx enters select() *after* it got an EAGAIN error on a read().

I suggest to fix this for now by changing srvx to do the following:

- call read() with a small buffer (8 kb or something) and continue
to call it until an ammount is returned that is less than this 8 kb.
Then call select() again to see if there is more data.

The point is that when read() returns less than what you ask for,
there will not be more data -- unless you ask for ridiculous ammounts
of data, as you do; then of course it can never return that ammount.

-- 
Carlo Wood <carlo@alinoe.com>

PS CC to kernel list, because I think this delay should not be there.
   If anyone on the kernel list knows what causes this, then please
   CC me as I am not subscribed to this list.

-- Following up on that: --

The examples in the previous mail are from the case with MTU is 8000
(and were to only two occurances of a EAGAIN for read() actually).

Allow me show the statistics for both MTU's:

MTU 16436:

~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu16436 | grep '\[srvx2\] read' | wc --lines
    323
~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu16436 | grep '\[srvx2\] read.*EAGAIN' | wc --lines
    323

Conclusion: ALL calls to select() that took longer than 0.1 second
were following a call to read() that failed with EAGAIN.  In total 323 times.


MTU 8000:

~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu8000 | grep '\[srvx2\] read' | wc --lines
      2
~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu8000 | grep '\[srvx2\] read.*EAGAIN' | wc --lines
      2

Idem, but only two occurences.


The total number of calls to select in both are respectively:

~>grep '\[srvx2\] select.*' mtu16436 | wc --lines
   1221
~>grep '\[srvx2\] select.*' mtu8000 | wc --lines
   658
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Carlo Wood: "Re: [ircu-development] Slow on high-MTU (local host) connections?"
Previous message: Carlo Wood: "Re: [ircu-development] Slow on high-MTU (local host) connections?"