We've been using this setup for some months now, but recently someone was doing
some bandwidth testing comparing the speed of the two links, and he saw that
eth1 was substantially slower than eth0. Checking the output of ifconfig, he
saw a very large number of carrier errors on eth1. Further checking showed
that there was a carrier error for every packet sent out on that link.
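In case it helps, here's a minimal sketch of reading the same counters
straight out of /proc/net/dev (illustrative only, not our actual test code;
it assumes the standard field layout, where the seventh transmit field is
the carrier error count):

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/net/dev", "r");
        char line[512];
        int lineno = 0;

        if (!f) {
                perror("/proc/net/dev");
                return 1;
        }

        while (fgets(line, sizeof(line), f)) {
                char name[32];
                unsigned long rx[8], tx[8];

                if (++lineno <= 2)      /* skip the two header lines */
                        continue;

                if (sscanf(line,
                           " %31[^:]:%lu %lu %lu %lu %lu %lu %lu %lu"
                           " %lu %lu %lu %lu %lu %lu %lu %lu",
                           name,
                           &rx[0], &rx[1], &rx[2], &rx[3],
                           &rx[4], &rx[5], &rx[6], &rx[7],
                           &tx[0], &tx[1], &tx[2], &tx[3],
                           &tx[4], &tx[5], &tx[6], &tx[7]) == 17)
                        /* tx[1] = packets sent, tx[6] = carrier errors */
                        printf("%-6s  tx_packets=%lu  carrier_errors=%lu\n",
                               name, tx[1], tx[6]);
        }

        fclose(f);
        return 0;
}

These are the same numbers that show up on the TX line of ifconfig as
"carrier:N".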
Attempting to isolate the problem, we found that if we changed the init scripts
to configure eth1 (rather than eth0) at boot, the problem moved to eth0. We
then made an exact copy of the contents of the ramdisk filesystem, booted the
exact same kernel and ramdisk image, and overrode the boot args to force it to
use the filesystem copy as an NFS-mounted root. The problem went away. We then
tried booting with NFS root, mounting the same ramdisk image, and then
configuring the second interface. Again, no problems.
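(To be clear about the override: the normal boot uses the ramdisk as root,
and the NFS test just points root at the filesystem copy using the standard
nfsroot parameters. Something along these lines, with the server and path as
placeholders rather than our real values:

        normal:    root=/dev/ram0
        nfs test:  root=/dev/nfs nfsroot=<server>:/<path-to-copy> ip=dhcp

Same kernel image in both cases; only the command line differs.)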
What it looks like is that simply having a ramdisk as root somehow causes
either errors, or the detection of errors, on whichever ethernet link is not
configured by the system at boot time. Interestingly, the vast majority of the
packets are actually getting through: we're seeing only about a tenth of a
percent packet loss, yet every packet sent generates a carrier error.
Does anyone have any ideas as to what could be causing this? We've tried making
a number of different ramdisks of various sizes, but it doesn't seem to make any
difference. The bootloader code has been patched to allow ramdisks of up to
32MB rather than the default 8MB, and it doesn't look like anything is getting
trampled at boot.
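(For comparison: on a stock setup the corresponding knob would be the
ramdisk_size= boot parameter, in KB, e.g. something like

        ramdisk_size=32768

but in our case the limit was raised in the bootloader code itself.)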
Thanks,
Chris
-- 
Chris Friesen                  | MailStop: 043/33/F10
Nortel Networks                | work:  (613) 765-0557
3500 Carling Avenue            | fax:   (613) 765-2986
Nepean, ON K2H 8E9 Canada      | email: cfriesen@nortelnetworks.com