Re: 2.4.14-pre8 stress testing

Bob Matthews (bmatthews@redhat.com)
Wed, 07 Nov 2001 10:58:19 -0500


Bob Matthews wrote:

> The problem kernel is statically linked, with HIGHMEM=64G, SMP, NFS
> client and server V3, eepro100, and the sym53c8xx driver. The machine
> is an 8xPIII configured as 8G/16G.
>
> The machine ran the test suite for ~17 hours, and then gradually began
> to slow down to the point where key presses at a virtual console took
> many seconds to echo. Eventually, the machine became unresponsive. The
> test harness clock is still ticking, and I can change VC's, but that's
> about it. Magic Sysrq doesn't give me anything except the name of the
> corresponding command. The machine does not appear to have generated
> any oops output.

Well, this is very strange indeed. Sometime during the night, this
machine came back to life and continued executing the test suite.
According to the logs, it ran the tests for approximately 7 hours, then
became unresponsive again. It is currently in the "zombie" state
described above: Magic Sysrq doesn't give anything but the name of the
command, and a shell I have opened on a VC won't echo any characters. I
guess we'll see if it resurrects itself again.

Manfred mentioned that Sysrq tries to take the task queue spinlock. Is
there some segment of code in the kernel which would cause a process to
grab the task queue spinlock and hold it for a long time under heavy
memory contention?

I should mention that the job mix induced by the test suite would be
considered unreasonable in a normal environment.

-- 
Bob Matthews
Red Hat, Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/