Bleah.
It turned out that the hpt370 read/write test alone hadn't caused it. My
colleague had launched "ping -f" in the background, which had immediately
triggered the oops. (When I found the oops on the screen, I initially thought
he had just left the hpt370 read/write test running and left.)
We rebooted and tried to reproduce it. ping -f didn't immediately trigger it,
but after a while it happened. We got a number of oopses, one of which was
similar to the first one and one of which showed process table corruption
(the name of the process in the oops was a random ASCII pattern).
We also got the oops with 2.2.20+patches, so this is not a pre2 thing.
Rather, the difference is that we now ran ping -f in the background.
The bad news is that all the BIOS configurations we thought were stable
(that had run the hpt370 read/write test without a hitch for days) now give
oopses and corruption pretty quickly when we run ping -f in the background :(.
Also, ping -f shows "...EEE.EE.EEE.." which I gather means the packets get
corrupted somewhere.
I'm not too hopeful about finding a set of BIOS settings that would fix
this. It seems the "stable" configuration we found just hid the problem, but
when we push the board further, it appears again.
The two disks on the HPT370, read in parallel, give about 60MB/s. Add the
10MB/s from the 3c905 to that, and we are pretty close to the 75MB/s number
that I've seen cited somewhere(1) as the maximum Via KT133 can do.
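For what it's worth, here is that arithmetic as a quick Python sketch. The
figures are the ones quoted in this mail; the function name pci_load is made
up for illustration, and simply adding the two streams assumes they share the
same PCI bus and peak at the same time, which I haven't verified.

```python
# Rough PCI load estimate from the figures quoted in this mail (all in MB/s).
HPT370_READ_MBPS = 60     # two disks on the HPT370, read in parallel
NIC_3C905_MBPS = 10       # 3c905 figure quoted in the mail
KT133_CEILING_MBPS = 75   # maximum quoted in reference (1)


def pci_load(disk_mbps, nic_mbps, ceiling_mbps):
    """Return (total, headroom, utilization) for two streams on one bus.

    Hypothetical helper: assumes both streams peak simultaneously.
    """
    total = disk_mbps + nic_mbps
    headroom = ceiling_mbps - total
    utilization = total / ceiling_mbps
    return total, headroom, utilization


total, headroom, utilization = pci_load(
    HPT370_READ_MBPS, NIC_3C905_MBPS, KT133_CEILING_MBPS
)
print(f"total {total} MB/s, headroom {headroom} MB/s, "
      f"utilization {utilization:.0%}")
```

So the combined load leaves only a few MB/s of headroom under the quoted
ceiling, which fits with the trouble showing up only once the network
traffic is added on top of the disk test.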
My conclusion at this point is that Via KT133 / Abit KT7-RAID PCI transfer
is positively FUBAR, and no sane person should touch the bugger with a
ten-foot pole. I'd be happy to be proven wrong, though.
-- v --
(1) http://www.tecchannel.de/hardware/817/1.html