My theory is there is an ICMP black hole between my server and some of its
clients. Is there a tool to pinpoint that black hole if it exists?
Can anyone suggest another cause or a direction for investigation?
Why does this affect 2.4 but not 2.2?
The characteristics I've discovered so far:
From strace of the server process, each write to the network is
preceeded by a select on the output fd. The select waits for a
long time, after which the write succeeds.
The packets are received by the client a couple minutes after my
server sends them.
The clients I have tested with are win98 and winNT.
The router for both 2.4 and 2.2 servers is running 2.2.18 with
ipvs (ipvs-1.0.2-2.2.18).
That router does not block any ICMP.
The behavior occurs on the 2.4 machine whether the packets are
routed directly or are mangled by ipvs.
I've tried the same machine with both 2.4 and 2.2, as well as
another machine with just 2.2. 2.2 works. 2.4 doesn't.
Both of my servers and the router I mentioned have two tulip
network cards.
The clients I've tested with are behind a modem through earthlink.
Another I suspect to have same problem is behind a modem
through Juno.
I've tried adjusting both /proc/sys/net/ipv4/route/min_adv_mss and
/proc/sys/net/ipv4/route/min_pmtu downward. Do these require an
ifconfig down/up to take effect?
Thanks for any help anyone can supply.
Brian
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/