Re: Have the 2.4 kernel memory management problems on large machines been fixed?

Andi Kleen (ak@suse.de)
22 May 2002 22:23:49 +0200


Linus Torvalds <torvalds@transmeta.com> writes:

> bigpage = alloc_bigpage_from_magic_zone();

How should the magic zone be handled? Do you propose to just size it with
__setup() and not put anything smaller than 4MB pages into it?
Otherwise fragmentation will likely kill it quickly.
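
A minimal sketch of the sizing arithmetic behind the __setup() variant,
written as user-space C so it compiles on its own. The option name
"magiczone=" and both helper names are hypothetical, not anything proposed
in this thread; in the kernel the parsing would sit in a __setup() handler.
The size is rounded down to whole 4MB pages so that nothing smaller ever
has to live in the zone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BIGPAGE_SIZE (4UL << 20)   /* the 4MB page size discussed above */

/*
 * Parse a size such as "512M" or "2G" (K/M/G suffixes, either case).
 * User-space stand-in for what a __setup() handler would do.
 */
static unsigned long long parse_size(const char *s)
{
    char *end;
    unsigned long long v = strtoull(s, &end, 0);

    switch (*end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; break;
    default: break;
    }
    return v;
}

/*
 * Hypothetical boot option "magiczone=<size>": return how many whole
 * 4MB pages the zone would hold, or 0 if the option does not match.
 */
static unsigned long magic_zone_bigpages(const char *opt)
{
    static const char prefix[] = "magiczone=";
    size_t n = sizeof(prefix) - 1;

    assert(opt != NULL);
    if (strncmp(opt, prefix, n) != 0)
        return 0;
    return (unsigned long)(parse_size(opt + n) / BIGPAGE_SIZE);
}
```

With this, "magiczone=512M" would reserve 128 big pages, and a request of
6M would be rounded down to a single 4MB page.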

It would still be a bit ugly that the memory couldn't be used for anything
else. I guess that would be ok for a pure Oracle hack, but even for a pure
Oracle hack it would be awfully special purpose and hard to use
(it needs a reboot for tuning and leaves lots of memory potentially unusable).

[BTW if you wanted to make it a truly bad Oracle hack(tm) then you could even
add a mode where there are no struct page entries in mem_map for the magic
zone; after all, 32+GB machines start to get limited by the size of mem_map
in low memory; the drawback is that it would need some hacks to make raw I/O
work again]
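
To put a rough number on the mem_map remark, here is the arithmetic as a
small C helper. The 64 bytes per struct page is an assumption, a round
figure picked for illustration; the real size depends on kernel version
and configuration. Under that assumption 32GB of RAM with 4KB pages needs
512MB of mem_map, a large share of the roughly 896MB of directly mapped
low memory on i386 with the default 3GB/1GB split.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES           4096ULL  /* 4KB base page */
#define ASSUMED_STRUCT_PAGE_BYTES 64ULL    /* assumption: round figure, not measured */

/* Bytes of mem_map needed to describe ram_bytes of physical memory. */
static unsigned long long mem_map_bytes(unsigned long long ram_bytes)
{
    assert(ram_bytes % PAGE_SIZE_BYTES == 0);
    return (ram_bytes / PAGE_SIZE_BYTES) * ASSUMED_STRUCT_PAGE_BYTES;
}
```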

One idea I had was to have a zone where you do not put any pte highmem
pages or other not easily freeable highmem pages, but only pure user pages.
Then, assuming rmap were included, it would be possible to do a simple
dumb defragment pass for this magic zone that frees a 4MB page by freeing
or moving smaller pages.
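
The defragment pass described above could look roughly like the following
user-space toy model. Everything in it (struct zone_model, the function
names, the "fewest used pages" selection rule) is a hypothetical sketch,
not kernel code. In a real kernel each move would also copy the data and
use rmap to find and update every pte pointing at the old frame; the model
only tracks which 4KB frames are in use.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_BIG 1024          /* 4MB / 4KB */

enum pstate { P_FREE, P_USER };     /* zone holds only free or movable user pages */

struct zone_model {
    enum pstate *page;              /* one entry per 4KB page frame */
    size_t nr_big;                  /* number of 4MB-aligned blocks */
};

/* Count the used small pages inside 4MB block b. */
static size_t used_in_block(const struct zone_model *z, size_t b)
{
    size_t used = 0;

    for (size_t i = 0; i < PAGES_PER_BIG; i++)
        used += z->page[b * PAGES_PER_BIG + i] == P_USER;
    return used;
}

/* Find a free 4KB frame outside block 'skip'; return -1 if there is none. */
static long find_free_outside(const struct zone_model *z, size_t skip)
{
    for (size_t p = 0; p < z->nr_big * PAGES_PER_BIG; p++)
        if (p / PAGES_PER_BIG != skip && z->page[p] == P_FREE)
            return (long)p;
    return -1;
}

/*
 * Dumb defragment pass: pick the block with the fewest user pages and
 * move each of them to a free frame elsewhere in the zone. Returns the
 * index of the emptied block, or -1 if there is not enough free space
 * outside it.
 */
static long defrag_one_bigpage(struct zone_model *z)
{
    size_t best = 0, best_used = PAGES_PER_BIG + 1, free_total = 0;

    if (z->nr_big == 0)
        return -1;
    for (size_t b = 0; b < z->nr_big; b++) {
        size_t u = used_in_block(z, b);

        free_total += PAGES_PER_BIG - u;
        if (u < best_used) {
            best = b;
            best_used = u;
        }
    }
    /* Free frames outside the chosen block must cover its used pages. */
    if (free_total - (PAGES_PER_BIG - best_used) < best_used)
        return -1;

    for (size_t i = 0; i < PAGES_PER_BIG; i++) {
        size_t src = best * PAGES_PER_BIG + i;

        if (z->page[src] != P_USER)
            continue;
        long dst = find_free_outside(z, best);

        assert(dst >= 0);            /* guaranteed by the check above */
        z->page[dst] = P_USER;       /* real code: copy data, fix ptes via rmap */
        z->page[src] = P_FREE;
    }
    return (long)best;
}
```

The linear search for a destination frame is deliberately naive; it keeps
the sketch short and matches the "simple dumb" spirit of the pass.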

The corner case is mlock(): it would likely need a page move. Raw I/O etc.
could likely be handled by just blocking on the page (under the assumption
that the I/O should always have a bounded lifetime), with perhaps
some measures to avoid livelock.
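
One way to read the paragraph above as a per-page policy, again as a
user-space toy with hypothetical names: mlock()ed pages are always moved,
pages pinned by raw I/O are waited on, and the wait is capped so that a
page that never gets unpinned cannot stall the pass forever (the caller
would then give up on that 4MB block and try another). The tick counter
only simulates I/O completing; the cap max_waits is an assumed tunable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum pin_kind { PIN_NONE, PIN_MLOCK, PIN_RAWIO };

struct page_model {
    enum pin_kind pin;
    unsigned io_ticks_left;   /* toy stand-in for raw I/O still in flight */
};

enum action { ACT_FREE_OR_MOVE, ACT_MUST_MOVE, ACT_GIVE_UP };

/* Toy stand-in for sleeping until the I/O makes some progress. */
static void wait_one_tick(struct page_model *p)
{
    if (p->io_ticks_left > 0 && --p->io_ticks_left == 0)
        p->pin = PIN_NONE;
}

/* Decide what the defragment pass may do with one page. */
static enum action evict_policy(struct page_model *p, unsigned max_waits)
{
    if (p->pin == PIN_MLOCK)
        return ACT_MUST_MOVE;       /* locked in memory: relocate, never drop */

    /* Block on the page, but only for a bounded number of waits. */
    for (unsigned w = 0; p->pin == PIN_RAWIO && w < max_waits; w++)
        wait_one_tick(p);

    return p->pin == PIN_RAWIO ? ACT_GIVE_UP : ACT_FREE_OR_MOVE;
}
```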

Do you think something like that would be worth it, or do you prefer
the really dumb version that just never tries to use the pages in
the magic zone for anything else?

-Andi