Re: [Bug report] System lockups on Tyan S2469 and lots of io [smp boot time problems too :(]

Vincent Touquet (vincent.touquet@pandora.be)
Tue, 8 Jul 2003 23:16:22 +0200


Another way to lock up the system:
dd if=/dev/zero of=/array/file bs=1024k count=10000

So this time I didn't even touch any code that comes near IDE
(unless you count swapping?).
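
To see roughly how far the write gets before the box hangs, the dd run can
be replaced by a loop that reports progress. This is only a sketch, not what
was run above: write_zeros and report_every are made-up names, and because
of the page cache the last count printed says how much was handed to the
kernel, not how much reached the controller.

```python
import sys


def write_zeros(path, block_size=1024 * 1024, count=10000, report_every=100):
    """Write `count` zero-filled blocks to `path`, like dd if=/dev/zero.

    The defaults mirror bs=1024k count=10000 from the dd command above.
    Returns the number of bytes handed to the kernel.
    """
    block = bytes(block_size)  # one block of zero bytes
    written = 0
    with open(path, "wb") as out:
        for i in range(1, count + 1):
            written += out.write(block)
            if i % report_every == 0:
                # Flush the message at once so it is visible before a hang.
                print(f"{i} blocks, {written} bytes", file=sys.stderr, flush=True)
    return written
```

Calling write_zeros("/array/file") would correspond to the dd command; with
the defaults it asks for 10000 * 1048576 = 10485760000 bytes.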

The process again ends in a hang:
Jul 8 22:54:55 kalimero kernel: 3w-xxxx: scsi0: AEN drain failed, retrying.
Jul 8 22:54:55 kalimero kernel: 3w-xxxx: scsi0: Controller errors, card not responding, check all cabling.
Jul 8 22:54:55 kalimero kernel: 3w-xxxx: scsi0: Reset sequence failed.
Jul 8 22:54:55 kalimero kernel: 3w-xxxx: scsi0: Unit #0: Command (f7c1cc00) timed out, resetting card.

Some interesting bits in the traces show the SCSI layer stuck in limbo:
Jul 8 22:54:55 kalimero kernel: kupdated D 00000046 5052 7 1 8 6 (L-TLB)
Jul 8 22:54:55 kalimero kernel: Call Trace: [call_reschedule_interrupt+5/11] [__down+192/352] [__down_failed+11/20] [.text.lock.super+279/518] [sync_old_buffers+102/336]
Jul 8 22:54:55 kalimero kernel: [kupdate+418/480] [kupdate+0/480] [arch_kernel_thread+46/64] [kupdate+0/480]
Jul 8 22:54:55 kalimero kernel: scsi_eh_0 R F7C64080 5760 8 1 9 7 (L-TLB)
Jul 8 22:54:55 kalimero kernel: Call Trace: [tw_scsi_eh_abort+504/768] [scsi_try_to_abort_command+136/208] [__down_interruptible+373/416] [scsi_unjam_host+2045/2672] [scsi_error_handler+376/608]
Jul 8 22:54:55 kalimero kernel: [arch_kernel_thread+46/64] [scsi_error_handler+0/608]

And of course the dd process is stuck in state 'D' (uninterruptible sleep).
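
If the machine still answers at all, the tasks stuck in 'D' can be listed
straight from /proc. A minimal sketch, assuming the usual "pid (comm) state"
layout of /proc/<pid>/stat; parse_stat and blocked_tasks are names invented
for this example.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Return sorted (pid, comm) pairs for every task currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited meanwhile, or the record was malformed
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

blocked_tasks() returns an empty list on a healthy, idle system; during the
hang described here one would expect dd and kupdated to show up in it.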

I should start browsing the sources for these scsi_* functions.
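
To get a list of which functions to read first, the symbol names can be
pulled out of the bracketed trace entries. A minimal sketch, assuming every
entry has the [symbol+offset/size] form seen in the traces above;
trace_symbols is a name invented for this example.

```python
import re

# One resolved trace entry as it appears in the log: [symbol+offset/size]
ENTRY = re.compile(r"\[([^\]\s+]+)\+(\d+)/(\d+)\]")


def trace_symbols(text, prefix=""):
    """Return the unique symbol names in `text`, in order of first appearance.

    Only names starting with `prefix` are kept; the default keeps everything.
    """
    seen = []
    for name, _offset, _size in ENTRY.findall(text):
        if name.startswith(prefix) and name not in seen:
            seen.append(name)
    return seen
```

Fed the scsi_eh_0 trace with prefix="scsi_", this gives
scsi_try_to_abort_command, scsi_unjam_host and scsi_error_handler, in that
order.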

I would really like to know if I'm looking at a software or a hardware
issue here.

best regards,

Vincent