Pretty much every single time, release_task() has been there on the
backtrace.
In fact, I bet you this code in do_exit() is the cause:
preempt_disable();
if (tsk->exit_signal == -1)
*** release_task(tsk); ***
schedule();
Note how "release_task()" will be releasing the stack that the process is
running on right now. And the reason it doesn't crash _every_ time is
simply that you need to have:
- another memory allocation that picks up that page and fills it with
something else in order to get a corrupted stack
- and something delays schedule() so that you have time to race _and_ you
need the stack. Which is why most of the oopses have an interrupt come
in inside schedule (see the "common_interrupt()" thing
In other words, I think we need to have schedule_tail() do the
release_task(), otherwise we'd release it too early while the task
structure (and the stack) are both still in use.
You owe me a patch.
Linus
---> [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010bc28>] handle_IRQ_event+0x38/0x60 > [<c010bf6b>] do_IRQ+0x14b/0x1e0 > [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010bf6b>] do_IRQ+0x14b/0x1e0 > [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010bf6b>] do_IRQ+0x14b/0x1e0 > [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c011e06c>] __put_task_struct+0x7c/0x90 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010bc28>] handle_IRQ_event+0x38/0x60 > [<c010bf6b>] do_IRQ+0x14b/0x1e0 > [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
> [<c010bc28>] handle_IRQ_event+0x38/0x60 > [<c010bf6b>] do_IRQ+0x14b/0x1e0 > [<c010a594>] common_interrupt+0x18/0x20 > [<c010a691>] error_code+0x2d/0x38 > [<c011b881>] schedule+0x3a1/0x3d0 > [<c01219fd>] release_task+0x17d/0x200 > [<c011e70f>] mmput+0x1f/0xc0 > [<c0122cad>] do_exit+0x31d/0x3b0
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/