http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.5.2-final-I0.patch
Stock 2.5.2 includes an 'interactivity estimator' method that covers most
of the things I consider important for good interactivity:
- sleep time based priority boost/penalty.
- constant frequency runqueue sampling instead of recalculation/switch
based runqueue sampling.
- interactivity based runqueue insertion on timeslice expire.
I'm very happy with the 2.5.2 solution: it's simpler than the one I used
in -H7. Good work, Davide!
There are a number of problems in 2.5.2 that need fixing though:
- renicing is broken - it does not work at all, neither up nor down, for
CPU-bound tasks. Renicing fell victim to the attempt to penalize CPU
hogs as much as possible: every CPU-bound task reaches the lowest
priority level and stays there. This also makes kernel compile times
suffer.
- RT scheduling is broken.
- the sleep average is hidden in p->prio, which makes it harder to
recover and use the true interactiveness of the task.
- the runqueue is sampled at a frequency of 20 HZ, which can misdetect
periodic user tasks that somehow correlate with 20 HZ.
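
To illustrate the last point, here is a toy user-space model of the
aliasing problem. It is not the kernel's sampling code: the function name,
the tick-based task model and HZ = 100 (so that 20 HZ sampling means one
sample every 5 ticks) are assumptions made for the example.

```c
#include <assert.h>

/*
 * Toy model (all names hypothetical): a task runs for 'busy' ticks at the
 * start of every 'period' ticks. We look at it once every 'sample_interval'
 * ticks and report the percentage of samples that found it running.
 */
static int sampled_busy_percent(int period, int busy,
                                int sample_interval, int total_ticks)
{
    int samples = 0, hits = 0;

    assert(period > 0 && sample_interval > 0 && total_ticks > 0);

    for (int t = 0; t < total_ticks; t += sample_interval) {
        samples++;
        if (t % period < busy)  /* task is running at tick t */
            hits++;
    }
    return 100 * hits / samples;
}
```

With an assumed HZ of 100, a task that runs 1 tick out of every 5 really
uses 20% of the CPU. sampled_busy_percent(5, 1, 5, 1000) models sampling
every 5 ticks in phase with the task and sees it running on every sample,
while sampled_busy_percent(5, 1, 1, 1000) models sampling on every tick and
sees the true 20%.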
I've fixed these problems/bugs by taking some of the -H7 solutions:
- introducing p->sleep_avg, which is updated in a lightweight way. No
more 'history slots'. A single counter, updated in a very simple way.
- limiting the bonus/penalty range according to nice levels - a task can
get at most a 5 priority level penalty over its default level. In
stock 2.5.2 it can reach the nice +19 level after a few seconds of
runtime. Nice levels work again.
- introducing HZ frequency runqueue sampling. Also the MAX_SLEEP_AVG
constant tells us how long into the past we are looking. This is 2
seconds right now.
- separating the RT timeslice code in scheduler_tick(). We used to break
the RT case way too often; now we can hack the SCHED_OTHER code without
having to touch the RT part.
- plus the patch also includes all the fixes and improvements from the
-H7 patch.
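
The single-counter sleep average and the bounded bonus/penalty from the
list above can be sketched roughly as follows. This is a user-space
illustration, not code from the patch: the struct, the function names,
HZ = 100, the symmetric 5-level bonus and the linear mapping are all
assumptions. Only the single counter, the 2-second MAX_SLEEP_AVG window
and the 5-level penalty limit come from the description.

```c
#include <assert.h>

#define HZ            100       /* timer ticks per second (assumption) */
#define MAX_SLEEP_AVG (2 * HZ)  /* how far into the past we look: 2 seconds */
#define MAX_BONUS     5         /* at most 5 levels either way (symmetry assumed) */

/* Hypothetical stand-in for the relevant task fields. */
struct task {
    int static_prio;  /* level implied by the nice value; lower is better */
    int sleep_avg;    /* recent sleep time in ticks, 0..MAX_SLEEP_AVG */
};

/* Once per timer tick while the task runs: running drains the counter. */
static void task_tick(struct task *p)
{
    if (p->sleep_avg > 0)
        p->sleep_avg--;
}

/* On wakeup: credit the ticks spent sleeping, capped at MAX_SLEEP_AVG. */
static void task_wakeup(struct task *p, int slept_ticks)
{
    assert(slept_ticks >= 0);
    if (slept_ticks > MAX_SLEEP_AVG - p->sleep_avg)
        p->sleep_avg = MAX_SLEEP_AVG;
    else
        p->sleep_avg += slept_ticks;
}

/* Map sleep_avg linearly onto [-MAX_BONUS, +MAX_BONUS] around static_prio. */
static int bounded_prio(const struct task *p)
{
    int bonus = 2 * MAX_BONUS * p->sleep_avg / MAX_SLEEP_AVG - MAX_BONUS;

    return p->static_prio - bonus;
}
```

In this model a pure CPU hog drains sleep_avg to zero and settles 5 levels
below its nice-derived level instead of sinking to the bottom, so two hogs
with different nice values stay ordered relative to each other.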
I've also cleaned up and commented the priority management code and have
introduced the prio_effective(p) inline function.
I've tested the patch on UP and SMP boxes. I've measured high-load
interactivity to be on par with that of stock 2.5.2.
Bug reports, comments, suggestions welcome.
Ingo