Yes, they start back up after the dump.
It certainly seems that, with the processes paused, taking
current->mm->mm_sem could be unnecessary for core dumps. I'm not so sure
that protecting the core file data from ptrace or /proc/pid/mem is
important in the case of core dumping.
I just don't want the kernel to lock up dumping the multithreaded core file.
I'm still not sure we have a problem yet. (Wishful thinking, I suppose.)
Also, in my testing of the patch I posted I've seen zero lockups caused by
a semaphore being held by one of the processes that gets paused temporarily.
To restate: the only way I see that my design gets into trouble is when a
semaphore is HELD, not getting waited on, by one of the processes that gets
put onto the phantom runqueue, AND that semaphore is needed in the processing
of elf_core_dump(...).
For this to happen, that semaphore would have to be held across calls to
schedule().
The ONLY place I've seen that in the kernel is set_CPUs_allowed +
migration_thread.
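
A minimal user-space sketch of the condition restated above, purely as an
illustration and not kernel code. The struct, the integer semaphore ids, and
the function name are all hypothetical; it only models "a paused task HOLDS
a semaphore that the dump path needs".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one thread of the dumping process. */
struct task_model {
    bool paused;   /* true if parked on the "phantom runqueue" for the dump */
    int  held_sem; /* id of a semaphore it currently holds, or -1 for none */
};

/*
 * Returns true if some paused task holds a semaphore that the dump path
 * needs. Tasks that are merely waiting on a semaphore are not modelled,
 * since they cannot block the dumper.
 */
static bool dump_can_deadlock(const struct task_model *tasks, size_t ntasks,
                              const int *needed, size_t nneeded)
{
    for (size_t i = 0; i < ntasks; i++) {
        if (!tasks[i].paused || tasks[i].held_sem < 0)
            continue; /* running, or holding nothing: harmless */
        for (size_t j = 0; j < nneeded; j++) {
            if (tasks[i].held_sem == needed[j])
                return true; /* dumper would sleep on it forever */
        }
    }
    return false;
}
```

In this model the dump is safe whenever the set of semaphores held by paused
tasks and the set needed by the dump path do not intersect.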
Can someone point me at other critical sections that have non-deterministic
lifetimes as a function of when the process holding the semaphore gets
scheduled onto a CPU? That type of code seems very risky to me. This is the
only type of code that could get my design into trouble.
--mgross