Re: Multithreaded coredump patch where?

mgross (mgross@unix-os.sc.intel.com)
Wed, 18 Dec 2002 08:13:25 -0800


On Tuesday 17 December 2002 03:05 am, Roberto Fichera wrote:
> >I haven't rebased the patches I posted back in June for a while now.
> >
> >Attached is the patch I posted for the 2.4.18 vanilla kernel.  Its a bit
> >controversial, but it seems to work for a number of folks.  Let me know if
> >you have any troubles re-basing it.
>
> Only one hunk failed on include/asm-ia64/elf.h but fixed by hand.
> Why do you say a bit controversial ? One difference that I have
> notice is in coredump size after your patch. However seem to be
> working well for now. I'll try later on a SMP machine.

There are 2 issues with this implementation that will likely not effect you.

First, when dumping core of MT applications with LOTS of threads the pthread
library signals all the threads in the application to exit. Sometimes
the process that is dumping core fails to suspend other threads in the
application before some exit. The result of this is that for such
applications you will not see them in the core file.

You have to work at it to see this failure. The way I reproduce this is to
run a test application with about 555 pthread threads in it and send it a
sig_quit. When I look at the core file wont have all 555 threads. SMP makes
this effect a bit more noticeable.

Ingo's design to fix this change the exit path for thread to wait for the
core file to get dumped before finishing the process clean up. I like this
approach, I just wish I thought of it ;)

Second, the controversial issue is in the way my design pauses the other
threads in the MT application. Its not semaphore lock safe. Although no
instance of the following failure has been seen, it is possible with new
kernel code.

If one of the processes in the MT application is currently holding semaphore
lock when the dumping process pauses it, AND the dumping process does any
blocking operation that could attempt to grab this same semaphore, THEN the
core dump will deadlock. Boom.

My patch is good for developers, pending the back port of Ingo's version.

Do let me know how it works out for you.

--mgross

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/