Re: 2.4.7p6 hang

Andrea Arcangeli (andrea@suse.de)
Wed, 11 Jul 2001 18:53:08 +0200

Next message: Jonathan Lundell: "Re: [PATCH] Re: Discrepancies between /proc/cpuinfo and Dave J's"
Previous message: Andrea Arcangeli: "Re: optimised rw-semaphores for MIPS/MIPS64"

On Wed, Jul 11, 2001 at 06:30:43PM +0200, Trond Myklebust wrote:
> >>>>> " " == Andrea Arcangeli <andrea@suse.de> writes:
>
> > ksoftirqd is quite scheduler intensive, and while its startup
> > is correct (no need for any change there), it tends to trigger
> > scheduler bugs (one of those bugs was just fixed in pre5). The
> > reason I have never seen the deadlock is that I also fixed this
> > other scheduler bug in my tree:
>
> > --- 2.4.4aa3/kernel/sched.c.~1~ Sun Apr 29 17:37:05 2001
> > +++ 2.4.4aa3/kernel/sched.c Tue May 1 16:39:42 2001
> > @@ -674,8 +674,10 @@
> >  #endif
> >      spin_unlock_irq(&runqueue_lock);
>
> > -    if (prev == next)
> > +    if (prev == next) {
> > +        current->policy &= ~SCHED_YIELD;
> >          goto same_process;
> > +    }
>
> >  #ifdef CONFIG_SMP
> >      /*
>
> I no longer see the hang with this patch, but I'm not sure I
> understand why it works.

I do. It's very subtle, and it comes down to the details of fork and
the scheduler.

> Does the above mean that the hang is occurring because spawn_ksoftirqd
> is yielding back to itself? If so, the semaphore trick seems more

No, that's a generic bug.

> robust, as it causes a proper sleep until it's safe to wake up.

An rwsem is definitely not more robust than the current code; if anything,
it hides the fact that sched_yield is broken in the scheduler. There is no
need to change it and waste some static RAM on an rwsem for no good reason.

The bug is that SCHED_YIELD must always be cleared by the time of a
fork(), or the child may never get scheduled. Only tasks currently running
on a CPU are allowed to have SCHED_YIELD set.
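
To make the starvation concrete, here is a minimal user-space sketch. It is a toy model, not the kernel's code: TOY_SCHED_YIELD and every toy_* name are hypothetical, and the three rules it encodes (a yielding task gets the lowest weight, fork copies the policy bits, the previous task keeps the CPU on a tie) are assumptions chosen to mirror the behaviour described above.

```c
#include <stddef.h>

/* Toy user-space model, not kernel code. TOY_SCHED_YIELD and every toy_*
 * name are hypothetical and exist only for this illustration. */
#define TOY_SCHED_YIELD 0x10

struct toy_task {
    int policy;   /* policy bits; may carry TOY_SCHED_YIELD */
    int counter;  /* remaining timeslice, used as the weight */
};

/* Assumed rule: a yielding task gets the lowest weight. */
static int toy_goodness(const struct toy_task *t)
{
    return (t->policy & TOY_SCHED_YIELD) ? -1 : t->counter;
}

/* Assumed rule: fork copies the parent's policy bits verbatim. */
static struct toy_task toy_fork(const struct toy_task *parent)
{
    return *parent;
}

/* Assumed rule: prev keeps the CPU unless a candidate is strictly better. */
static struct toy_task *toy_pick_next(struct toy_task *prev,
                                      struct toy_task *others, size_t n)
{
    struct toy_task *next = prev;
    int best = toy_goodness(prev);
    size_t i;

    for (i = 0; i < n; i++) {
        int w = toy_goodness(&others[i]);
        if (w > best) {
            best = w;
            next = &others[i];
        }
    }
    return next;
}
```

In this model, a parent that forks while its yield bit is set produces a child that ties with it at the lowest weight, so toy_pick_next() keeps returning the parent. Clearing the bit in the child is enough to let the child win.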

Another way to cure the deadlock would be to clear SCHED_YIELD in the child,
so that you could even do something as silly as:

    current->policy |= SCHED_YIELD;
    fork();
    schedule();

but the above doesn't make sense, so we can optimize away clearing
SCHED_YIELD in the child at fork time. And even if you allowed the above,
you would still need my fix quoted above for performance reasons: once
schedule() returns, the last sched_yield attempt is over, and the next time
we run schedule() without requesting a yield we don't want it to be treated
like a sched_yield again. (That was in fact the original reason for the
patch. I only noticed now that the bug has very serious implications for
fork. Those implications are not limited to ksoftirqd; they can also
trigger with normal userspace forks. It's only that ksoftirqd bangs on the
scheduler hard enough to make the problem reproducible.)
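
The performance side of the fix can be sketched the same way. Again this is a toy user-space model under stated assumptions, not the kernel's schedule(): the toy_* names, the flag value, and the apply_fix switch are all hypothetical, and the model assumes that a real context switch clears the outgoing task's yield bit while the prev == next path clears it only when the fix is applied.

```c
#include <stddef.h>

#define TOY_SCHED_YIELD 0x10   /* made-up flag value for this sketch */

struct toy_task {
    int policy;   /* policy bits; may carry TOY_SCHED_YIELD */
    int counter;  /* remaining timeslice, used as the weight */
};

/* Assumed rule: a yielding task gets the lowest weight. */
static int toy_goodness(const struct toy_task *t)
{
    return (t->policy & TOY_SCHED_YIELD) ? -1 : t->counter;
}

/* One pass of a toy schedule(). `other` may be NULL when nothing else is
 * runnable. `apply_fix` selects whether the prev == next path clears the
 * yield bit, which is what the quoted patch does. */
static struct toy_task *toy_schedule(struct toy_task *prev,
                                     struct toy_task *other,
                                     int apply_fix)
{
    struct toy_task *next = prev;

    if (other != NULL && toy_goodness(other) > toy_goodness(prev))
        next = other;

    if (prev == next) {
        if (apply_fix)
            prev->policy &= ~TOY_SCHED_YIELD;
        return prev;
    }

    /* Assumed: on a real switch the outgoing task's yield bit is cleared. */
    prev->policy &= ~TOY_SCHED_YIELD;
    return next;
}
```

Without the fix, a task that yields while nothing else is runnable comes back with the bit still set, so its next ordinary schedule() call is weighed as if it had yielded again and it loses to a task with a smaller counter. With the fix, the bit is gone by the time schedule() returns and the task competes on its real weight.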

Andrea
