Shawn> It just caused sshd to hang. I don't know why Here's what
Shawn> strace reports: Sshd is stuck in 'D' and a child in zombie
Shawn> state. The machine has been up for 2 days 18 hours 50 mins.
I may have missed some emails on this topic, so I apologize if this
objection was already answered. In any case, if you're running with
Andrew's patch that added a del_timer_sync to release_dev(), I think I
see how that could cause problems. Here's what I said in my previous
email:
From looking at workqueue.c (especially the comment in
queue_delayed_work() that says "Increase nr_queued so that the flush
function knows that there's something pending."), it seems like
flush_scheduled_work() should wait until even delayed work is done.
Given that, I don't think the del_timer_sync() should be there --
wouldn't flush_scheduled_work() block forever, since nr_queued can
never reach 0 now?
Also, I didn't see an answer to the worry I expressed below:
I don't see how it's _ever_ safe to call flush_scheduled_work().
The comment in workqueue.c before flush_workqueue() says "NOTE: if
work is being added to the queue constantly by some other context
then this function might block indefinitely." But
flush_scheduled_work() is flushing the keventd_wq, which other code
will definitely add work to. If we're unlucky,
flush_scheduled_work() could block forever. Am I just being
paranoid?
I admit that flush_scheduled_work() is unlikely to cause problems in
most real world situations, but I'd be curious to know if anyone
thinks this could ever lead to a livelock under some strange condition.
- Roland
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/