in our workgroup we run simulation jobs that may take a few days at a time
to complete. Since recently, after upgrading some boxen to 2.4.2, we have
jobs flaking out with messages like
Trace/Breakpoint trap
without core dumps. Unfortunately this happens unpredictably but
consistently after several hours. The same binaries run on (identical)
other machines (with 2.2.x) without problems. There is definitely no
debugger interference as one might think seeing these messages. Also,
hardware//overheating problems can be ruled out.
Digging around in lkml archives, I found this old (Jun 23 2000) post
concerning "2.4.0-test2-11"
>ok, things are starting to get really odd on 2.4.0-test2-11 now.
># tar xzf storage/glibc-2.1.3.tar.gz
># tar tzf storage/glibc-linuxthreads-2.1.3.tar.gz
>Trace/breakpoint trap
># tar tzf storage/glibc-linuxthreads-2.1.3.tar.gz
>Trace/breakpoint trap
># strace -f tar tzf storage/glibc-linuxthreads-2.1.3.tar.gz
>Trace/breakpoint trap
># ltrace -s 256 -S -f tar tzf storage/glibc-linuxthreads-2.1.3.tar.gz
>Trace/breakpoint trap
># ls
>Trace/breakpoint trap
># pwd
>/usr/src
># whoami
>Trace/breakpoint trap
># dmesg
>Trace/breakpoint trap
This is extreme compared to our little problem, but perhaps there is a
connection? Please enlighten me if anybody knows what is going on. Or give
me a hint on how to go about debugging this myself.
Regards
Max
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/