Re: [2.4.17/18pre] VM and swap - it's really unusable

yodaiken@fsmlabs.com
Sat, 12 Jan 2002 07:56:38 -0700


On Sat, Jan 12, 2002 at 02:25:03PM +0100, Roman Zippel wrote:
> Hi,
>
> yodaiken@fsmlabs.com wrote:
>
> > > > SCHED_FIFO leads to
> > > > niced app 1 in K mode gets Sem A
> > > > SCHED_FIFO app preempts and blocks on Sem A
> > > > whoops! app 2 in K mode preempts niced app 1
> > >
> > > Please explain what's different without the preempt patch.
> >
> > See that "preempt" in line 2 . Linux does not
> > preempt kernel mode processes otherwise. The beauty of the
> > non-preemptive kernel is that "in K mode every process makes progress"
> > and even the "niced app" will complete its use of SemA and
> > release it in one run.
>
> The point of using semaphores is that one can sleep while holding them,
> whether this is forced by preemption or voluntary makes no difference.

No. The point of using semaphores is that one can sleep while
_waiting_ for the resource. Sleeping while holding semaphores is
a different kettle of lampreys entirely.
And it makes a very big difference:
A:
  get sem on memory pool
  do something horrible to pool
  release sem on memory pool

In a preemptive kernel this can cause a deadlock. In a non-preemptive
kernel it cannot. You are correct in that
B:
  get sem on memory pool
  do potentially blocking operations
  release sem
is also dangerous - but I don't think that helps your case.
To fix B, we can enforce a coding rule - one of the reasons why
we have all those atomic ops in the kernel is to be able to
avoid this problem.
To fix A in a preemptive kernel we need to start messing about with
priorities and that's a major error.
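
To make pattern A concrete, here is roughly what I mean in 2.4-style C;
the pool structure and the function are invented for illustration, not
lifted from any real driver:

#include <linux/kernel.h>
#include <asm/semaphore.h>

/* hypothetical shared pool, purely for illustration */
struct pool {
        int free_count;
};

static DECLARE_MUTEX(pool_sem);         /* 2.4 semaphore used as a mutex */

static void pool_update(struct pool *p)
{
        down(&pool_sem);        /* get sem on memory pool            */
        p->free_count--;        /* do something horrible to the pool */
        up(&pool_sem);          /* release sem on memory pool        */
}

On a non-preempt kernel, once the niced holder is scheduled it runs
straight from down() to up(). With preempt, a SCHED_FIFO task can preempt
it in between, call pool_update() itself and sleep in down(); if a third,
middle-priority task then keeps the niced holder off the CPU, the FIFO
task waits indefinitely. That's the priority mess I mean.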
"The current kernel has too many places where processes
can sleep while holding semaphores so we should always have the
potential of blocking with held semaphores" is, to me, a backwards
argument.

> > If you have a reasonably fair scheduler you
> > can make very useful analysis with Linux now of the form
> >
> > Under 50 active processes in the system means that in every
> > 2 second interval every process
> > will get at least 10ms of time to run.
> >
> > That's a very valuable property and it goes away in a preemptive kernel
> > to get you something vague.
>
> How is that changed? AFAIK inserting more schedule points does not
> change the behaviour of the scheduler. The niced app will still get its
> time.

How many times can an app be preempted? In a non-preempt kernel
it can be preempted in user mode at timer frequency and no more,
and it cannot be preempted in kernel mode at all. So
while (1) {
  read mpeg data
  process
  write bitmap
}
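
Spelled out as plain user-space C, that is nothing more than the sketch
below (file names and buffer size are made up for illustration):

#include <fcntl.h>
#include <unistd.h>

#define CHUNK 4096

int main(void)
{
        char buf[CHUNK];
        ssize_t n;
        int in  = open("movie.mpg", O_RDONLY);    /* hypothetical input  */
        int out = open("/dev/fb0", O_WRONLY);     /* hypothetical output */

        if (in < 0 || out < 0)
                return 1;
        while ((n = read(in, buf, sizeof(buf))) > 0) {  /* read mpeg data */
                /* process: decode step elided */
                write(out, buf, n);                     /* write bitmap   */
        }
        return 0;
}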

Assuming Andrew does not get too ambitious about read/write granularity, once this
process is scheduled on a non-preempt system it will always make progress. The
non-preempt kernel says, "your kernel request will complete - if we have resources."
A preempt kernel says, "well, if nobody more important activates, you get some time."
Now you do the analysis based on the computation of "goodness" to show that there is
a bound on the preemption count during an execution of this process. I don't want to
have to think that hard.
Let's suppose the Gnome desktop constantly creates and
destroys fresh I/O-bound tasks to do something. So with the old-fashioned
non-preempt kernel (ignoring Andrew) we get:
  wait no more than 1 second
  I'm scheduled and start a read
  wait no more than 1 second
  I'm scheduled and in user mode for at least 10 milliseconds
  wait no more than 1 second
  I'm scheduled and do my write
  ...
with preempt we get:
  wait no more than 1 second
  I'm scheduled and start a read
  I'm preempted
  read not done
  come back for 2 microseconds
  preempted again
  haven't issued the damn read request yet
  ok, a miracle happens, I finish the read request
  go to user mode and an interrupt happens
  well, it would be stupid to have a goodness function in a preempt
  kernel that lets a low priority task finish its time slice, so preempt
  ...
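
For reference, the "goodness" I keep waving at is roughly the following
(simplified from memory of the 2.4 scheduler; the real goodness() also
handles SCHED_YIELD and the mm/CPU affinity bonuses):

#include <linux/sched.h>

/* rough sketch only -- not the real 2.4 goodness() */
static int goodness_sketch(struct task_struct *p)
{
        if (p->policy != SCHED_OTHER)           /* SCHED_FIFO / SCHED_RR    */
                return 1000 + p->rt_priority;   /* always beats timesharing */
        if (!p->counter)                        /* timeslice used up        */
                return 0;
        return p->counter + 20 - p->nice;       /* remaining slice + nice   */
}

A freshly woken I/O-bound task almost always scores higher than a niced
task that has burned most of its slice, so with in-kernel preemption every
such wakeup kicks the niced task off the CPU - which is exactly the
preemption count I don't want to have to bound.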

>
> > So your argument is that I'm advocating Andrew Morton's patch which
> > reduces latencies more than the preempt patch because I have a
> > financial interest in not reducing latencies? Subtle.
>
> Andrew's patch requires constant auditing and Andrew can't audit all
> drivers for possible problems. That doesn't mean Andrew's work is
> wasted, since it identifies problems, which preempting can't solve, but
> it will always be a hunt for the worst cases, where preempting goes for
> the general case.

The preempt patch requires constant auditing too - and more complex auditing.
After all, a missed audit in Andrew's patch will simply increase worst-case timing.
A missed audit in preempt will hang the system.

>
> > In any case, motive has no bearing on a technical argument.
> > Your motive could be to make the 68K look better by reducing
> > performance on other processors for all I know.
>
> I am more than busy keeping it running (together with a few others, who
> are left) and, more important, I make no money off it.

Come on! First of all, you are causing me a great deal of pain by making
me struggle not to make some bad joke about the economics of Linux companies.
More important, not making money has nothing to do with purity of motivation -
don't you read this list?
And how do I know that you haven't got a stockpile of 68K boards that may
be worth big money once it's known that 68K Linux is at the top of the heap?
Much less plausible money-making schemes have been tried.

Seriously: for our business, a Linux kernel that can reliably run at
millisecond-level latencies is only good. If you could get a Linux kernel to run at
latencies of 100 microseconds worst case on a 486, I'd be a little more
worried, but even then ...
On an 800MHz Athlon, RTLinux scheduling jitter is 17 microseconds worst case right now.

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com
