Ah, but you got an oops! That has nothing to do with the bug triggered by
the Google workload. I thought "the system locked" above meant a live
lock, like the one the Google workload triggered before the kernel bug
behind it was fixed. I couldn't imagine "the system locked" meant
"I got an oops", sorry.

So let's check the oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000200
printing eip:
00000200
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<00000200>] Not tainted
EFLAGS: 00010202
eax: 00000001 ebx: c02b77a0 ecx: c8800000 edx: 00000002
esi: 00000002 edi: c10456c0 ebp: c025b4c0 esp: c7fe7f20
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 4, stackpage=c7fe7000)
Stack: c10456a4 c0126911 c02b77a0 c7fe6000 0000060e c025b610 0000004e 0000001d
       c11e7d50 c0820000 c11e7b90 00000001 0000001f c025b610 000001d0 c025b610
       c0126b94 000001d0 c7fe7f88 00000020 00000020 000001d0 c0126bff c7fe7f88
Call Trace: [<shrink_cache+0x321>] [<shrink_caches+0x64>] [<try_to_free_pages+0x5f>]
   [<kswapd_balance_pgdat+0x51>] [<kswapd_balance+0x16>] [<kswapd+0xa1>]
   [<kswapd+0x0>] [<kernel_thread+0x2b>]

This looks like faulty hardware. Are you sure your RAM is not bad?
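
For what it's worth, the address looks suspicious on its own: EIP and the
faulting address are both 00000200, which is a NULL pointer with exactly one
bit set. The small Python sketch below only makes that arithmetic explicit
(differing_bits is a helper name made up for illustration). A single flipped
bit is just a hint in the direction of bad RAM, not proof; a software bug
could store the same value.

```python
def differing_bits(value, expected=0, width=32):
    """Return the bit positions where value differs from expected."""
    diff = (value ^ expected) & ((1 << width) - 1)
    return [bit for bit in range(width) if diff & (1 << bit)]


# EIP / faulting address taken from the oops above.
eip = 0x00000200

# Compare against NULL (0): a one-element result means a single-bit difference.
print(differing_bits(eip))  # [9]
```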

I will now start an infinite loop of the Google workload. By the way, how
many runs does it take before you can reproduce it?

Before fixing the bug I could reproduce the live lock during the very
first run. So to verify that the problem was fixed I ran the workload 3 or
4 times, but never more than that (mainly because the pause() forced me
to press ^C each time, so I never automated the loop; I will now drop the
pause() and start an infinite loop).
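
Automating the loop could look roughly like the sketch below. It is only an
illustration in Python, not the actual test setup: run_until_failure, the log
file name and the command line are all made up. Since a lockup or an oops may
never hand back an exit status, it syncs the run number to disk before each
run, so the count is still there after a hang and answers the "how many runs"
question.

```python
import os
import subprocess


def run_until_failure(cmd, log_path, max_runs=None):
    """Run cmd repeatedly.

    Returns the 1-based number of the first run that exits non-zero,
    or None if max_runs runs all succeeded. With max_runs=None it
    loops until a run fails.
    """
    run = 0
    with open(log_path, "a") as log:
        while max_runs is None or run < max_runs:
            run += 1
            # Write the run number and force it to disk first, so the
            # count survives if the machine locks up during this run.
            log.write(f"starting run {run}\n")
            log.flush()
            os.fsync(log.fileno())
            result = subprocess.run(cmd)
            if result.returncode != 0:
                return run
    return None


# Hypothetical usage; "./workload" stands in for the real test program:
#   run_until_failure(["./workload"], "runs.log")
```

After a hang, the last "starting run N" line in the log gives the run that
triggered it.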
Andrea