Sep 22 14:25:34 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:34 db001 kernel: scsi0: PCI error Interrupt at seqaddr = 0x8
Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 800000a0
Sep 22 14:25:34 db001 kernel: scsi0: Data Parity Error Detected during address or write data phase
Sep 22 14:25:34 db001 kernel: eth0: using NWAY device table, not 8
Sep 22 14:25:34 db001 kernel: scsi1: PCI error Interrupt at seqaddr = 0x8
Sep 22 14:25:34 db001 kernel: scsi1: Data Parity Error Detected during address or write data phase
Sep 22 14:25:34 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 80000020
Sep 22 14:25:34 db001 kernel: eth0: using NWAY device table, not 8
Sep 22 14:25:35 db001 kernel: eth0: Host error, FIFO diagnostic register 0000.
Sep 22 14:25:35 db001 kernel: eth0: PCI bus error, bus status 80000020
Sep 22 14:25:35 db001 kernel: eth0: using NWAY device table, not 8
The last three lines then repeat for roughly 300,000 lines. After
stripping the timestamps and running egrep 'eth|scsi' /var/log/messages
| sort | uniq on those lines, I got:
NETDEV WATCHDOG: eth0: transmit timed out
eth0: 3Com PCI 3c980 10/100 Base-TX NIC(Python-T) at 0x1400, 00:e0:81
eth0: 3Com PCI 3c980 10/100 Base-TX NIC(Python-T) at 0x1c00, 00:e0:81
eth0: Host error, FIFO diagnostic register 0000.
eth0: Interrupt posted but not delivered -- IRQ blocked by another dev
eth0: PCI bus error, bus status 80000020
eth0: PCI bus error, bus status 800000a0
eth0: Resetting the Tx ring pointer.
eth0: Too much work in interrupt, status e003.
eth0: transmit timed out, tx_status 00 status 7003.
eth0: transmit timed out, tx_status 00 status 7043.
eth0: using NWAY device table, not 8
scsi0: Data Parity Error Detected during address or write data phase
scsi0: PCI error Interrupt at seqaddr = 0x8
scsi1: Data Parity Error Detected during address or write data phase
scsi1: PCI error Interrupt at seqaddr = 0x8
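For anyone who wants to reproduce the summary above, the timestamp stripping and de-duplication can be sketched in Python roughly as follows. This is only an illustration, not the exact commands I ran: the function name unique_messages and the prefix regex are assumptions, and the regex assumes the "Mon DD HH:MM:SS host kernel: " layout seen in the log lines quoted in this post.

```python
import re

# Assumed syslog prefix layout: "Sep 22 14:25:34 db001 kernel: "
PREFIX = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:\s+"
)
# Same pattern as the egrep in the post.
WANTED = re.compile(r"eth|scsi")


def unique_messages(lines):
    """Return the sorted set of eth/scsi messages with the prefix removed."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not WANTED.search(line):
            continue
        seen.add(PREFIX.sub("", line))
    return sorted(seen)


# A few lines copied from the log excerpt above.
SAMPLE = [
    "Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 800000a0",
    "Sep 22 14:25:34 db001 kernel: eth0: PCI bus error, bus status 80000020",
    "Sep 22 14:25:35 db001 kernel: eth0: PCI bus error, bus status 80000020",
    "Sep 22 14:25:34 db001 kernel: scsi0: PCI error Interrupt at seqaddr = 0x8",
]

for message in unique_messages(SAMPLE):
    print(message)
```

To run it over the real log, pass an open file object (for example open("/var/log/messages")) instead of SAMPLE. Python's sort order may differ from what sort(1) produces under some locales.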
Looking in /proc/interrupts, I noticed that eth0 and dpti were sharing
an IRQ. Is this the likely cause of the network failure? If so, does
anyone know of a way to get the PCI BIOS to assign separate IRQs to the
RAID card and the dual 3Com? (I have a Tyan S2462 Thunder K7 board, and
the manual says nothing about this.) I have disabled the onboard SCSI
(dual AIC7xxx), serial, and parallel ports, and I have pulled the RAID
card from the machine and power-cycled a few times, but when I put it
back in, it shares an IRQ with the 3Com again. (I suppose I should try
disabling and re-enabling the 3Coms too.)
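In case it helps anyone checking their own machine, here is a rough Python sketch of how shared IRQs could be picked out of /proc/interrupts. The sample text is hypothetical: the IRQ number, counts, and controller column are invented for illustration, and only the fact that eth0 and dpti share a line comes from my system. The function name shared_irqs is also just an assumption, and the parsing assumes device names contain no spaces.

```python
def shared_irqs(text):
    """Map IRQ label -> device names for lines listing more than one device."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # header line or blank line
        parts = [p.strip() for p in rest.split(",")]
        if len(parts) < 2:
            continue  # only one device on this IRQ
        # The first chunk holds counts and controller type; the device
        # name is assumed to be its last whitespace-separated token.
        first_tokens = parts[0].split()
        if not first_tokens:
            continue
        shared[label.strip()] = [first_tokens[-1]] + parts[1:]
    return shared


# Hypothetical /proc/interrupts contents (values invented for illustration).
SAMPLE = """\
           CPU0       CPU1
  0:     123456     123400    IO-APIC-edge  timer
 10:      54321      54000   IO-APIC-level  eth0, dpti
NMI:          0          0
"""

for irq, devices in shared_irqs(SAMPLE).items():
    print(irq, devices)
```

On a live system the input would be the contents of /proc/interrupts, read with open("/proc/interrupts").read().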
A related question is: should these drivers be able to share IRQs, i.e.
is it a worthwhile goal to have them operate reliably while sharing
IRQs, or is IRQ-sharing a performance loss and something to be avoided?
Thanks,
Brian Strand