The OOM killer never kills anything when the machine _isn't_ OOM.
That is not its job.  The OOM killer is there to fix one particular
crisis: when memory is needed for further processing but none at all
exists.  It kills a process when the alternative is a kernel crash.
The OOM killer is not there to make your machine perform reasonably;
it is not a load-control measure.
> We can argue any old factors for selection, but I would
> first argue that the real problem is that nothing was killed because the
> problem was not noticed.
What I am saying is that you need another killer.  The machine wasn't
OOM, so of course the OOM killer didn't notice.  It was merely using
its memory in a stupid way, causing extremely bad performance.  It
isn't OOM when there's 600M in buffers - all of that can be freed.
Fixing this case would be nice. But overload scenarios are
still possible, so what you want is probably an overload killer.
> One possible way to recognize the problem is to identify the ratio of
> page faults to time slice used and assume there is trouble in River City
> if that gets high and stays high. I leave it to the VM gurus to define
> "high," but processes which continually block for page fault as opposed
> to i/o of some kind are an indication of problems, and likely to be a
> factor in deciding what to kill.
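Just to make that idea concrete, here is a crude user-space sketch:
watch a pid's major fault count against the CPU time it actually
gets, and flag it if the ratio stays high.  The 10-second window and
the thresholds are pulled out of thin air, and a real implementation
would of course live in the kernel and look at the task_struct
counters directly - this is only an illustration.

/* Rough sketch of a thrash detector: many major faults per tick of
 * CPU actually used suggests the process mostly waits on its own
 * pages.  Thresholds and the sampling window are arbitrary. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read cumulative major faults and CPU ticks for one pid. */
static int read_stat(pid_t pid, unsigned long *majflt, unsigned long *ticks)
{
	char path[64], buf[1024];
	FILE *f;
	char *p;
	unsigned long utime, stime;

	snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* comm may contain spaces, so skip past the closing ')' */
	p = strrchr(buf, ')');
	if (!p)
		return -1;
	/* after comm: state ppid pgrp session tty tpgid flags
	 * minflt cminflt majflt cmajflt utime stime ... */
	if (sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %lu %*u %lu %lu",
		   majflt, &utime, &stime) != 3)
		return -1;
	*ticks = utime + stime;
	return 0;
}

int main(int argc, char **argv)
{
	pid_t pid;
	unsigned long flt0, t0, flt1, t1;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	pid = atoi(argv[1]);

	if (read_stat(pid, &flt0, &t0))
		return 1;
	sleep(10);
	if (read_stat(pid, &flt1, &t1))
		return 1;

	/* Lots of major faults, and far more faults than CPU ticks
	 * used, over the whole window: looks like thrashing. */
	if (flt1 - flt0 > 100 && flt1 - flt0 > 10 * (t1 - t0 + 1))
		printf("pid %d looks like it is thrashing\n", (int)pid);
	return 0;
}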
Note that it is possible to have a machine that performs excellently
even if one process is thrashing to hell (and spending weeks on
a 5-minute task due to thrashing).
How?  This is possible if the process isn't allowed to use more
than some reasonable fraction of RAM.  It can swap a lot if it
needs more, but other, better-behaved processes will run at full
speed without swapping, and still get enough cache for file I/O.
(You definitely want swap on a separate spindle in this case, or
you lose I/O performance for the other processes.)
I believe some OSes, like VMS, can do this.
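On Linux the nearest user-visible knob would be something like
setrlimit(RLIMIT_RSS, ...).  The interface is there, but as far as I
know the kernel does not actually enforce the RSS limit, so take the
following only as a sketch of what driving an administrator-set
quota would look like, not as a working fix:

/* Hypothetical "memquota" wrapper: cap the resident set of the job
 * we exec.  Linux accepts RLIMIT_RSS but does not really enforce
 * it; an OS with VMS-style working-set limits would make the job
 * page against itself instead of against everyone else. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct rlimit lim;

	if (argc < 3) {
		fprintf(stderr, "usage: %s <max-rss-in-MB> <command> [args...]\n",
			argv[0]);
		return 1;
	}

	/* Same soft and hard limit, in bytes. */
	lim.rlim_cur = lim.rlim_max = (rlim_t)atol(argv[1]) * 1024 * 1024;
	if (setrlimit(RLIMIT_RSS, &lim))
		perror("setrlimit");

	execvp(argv[2], &argv[2]);	/* run the job under the quota */
	perror("execvp");
	return 1;
}

Something like "memquota 200 bigsim" would then keep bigsim's own
thrashing from evicting everyone else - on an OS that honours the
limit.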
The problem with this approach is administration. There is no
automatic way to estimate how much RAM is reasonable for a process.
A big simulation with no I/O can reasonably use 99% of the memory on
a dedicated machine, but a quota like that would cripple both desktop
and server machines.  So administrators would have to set memory
quotas for every process, which is a lot of work.
And you may have to set quotas for every run, so you can't just
stick them in a script.  gcc is one example: I have enough memory
to run several in parallel for a kernel compile, but only one at a
time for a big C++ compile.
Helge Hafting