2.4.10 VM, active cache pages, and OOM

Tobias Ringstrom (tori@ringstrom.mine.nu)
Sat, 29 Sep 2001 18:19:13 +0200 (CEST)


First I'd like to say that the 2.4.10 VM works great for my desktop and
home server, much better than previous versions. I have not tried Alan's
kernels.

I do have one problem, though, and it is illustrated by the following very
simple program:

#include <unistd.h>
int main()
{
char buf[512];
while (read(0, buf, sizeof(buf)) == sizeof(buf))
;
return 0;
}

The program should be reading a block device, but a big file probably does
the trick as well.

./a.out < /dev/hde1

When the program is running, all cached pages pop up in the active list,
and when the memory is full of active pages, the computer starts to page
out stuff, becomes VERY unresponsive, and after half a minute or so it
goes OOM and starts killing processes. There are lots and lots of free
swap at this time. I also get a bunch of 0-order allocation failures in
the log.

(I'd say that the OOM killer does seem to kill the most memory-hog-like
processes, but the problem is that it is not the processes that use up all
the memory, it is the active cache pages.)

If the buf size is changed to a multiple of the page size, such as 4096,
the cache pages are instead added to the inactive list, and the system is
very responsive, no paging occurs, and it does not go OOM. In other
words, it works perfectly.

I assume that the difference between a buf size of 512 and 4096 is that
for the 512-byte case, each page is touched more than once, and that's why
the system think the pages are active. This is a very wrong decision,
since I'm doing a sequential read.

Fixing that particular problem will get rid of my problem, but I'm
guessing that it would only hide another real problem, which is that
2.4.10 has a huge problem freeing pages from the list of active pages,
even if they are clean, and thus making a wrong decision on the
availibility of free(able) pages.

Am I right to assume that if I would make the program do random seeks, or
read each page twice, the pages would again be added to the active list,
even if I would read whole pages at a time?

I also wonder why the system get so unresponsive before it goes OOM.
Perhaps there is a kernel process running, scanning lists trying to free
memory, but not finding any, wasting all CPU cycles.

What do you think?

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/