When the page is exclusive we can definitely write to it; dropping the
page from the swapcache while leaving the pte wrprotected just asks for a
COW page fault that will simply allocate another page, copy the old one
into the new one and finally free (really free) the old one. So I think
that part is correct.
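
For illustration, here is a minimal sketch of that fallback path (a
hypothetical simplification, not the actual 2.4 do_wp_page, assuming the
mm/memory.c helpers like establish_pte and copy_user_highpage):

/*
 * Illustrative only: the cost of a write fault on a wrprotected,
 * exclusive page once it has been dropped from the swapcache.
 */
static int cow_fault_sketch(struct vm_area_struct *vma, unsigned long address,
                            pte_t *page_table, struct page *old_page)
{
        struct page *new_page;

        new_page = alloc_page(GFP_HIGHUSER);             /* alloc another page */
        if (!new_page)
                return -1;                               /* OOM */

        copy_user_highpage(new_page, old_page, address); /* copy old into new */
        flush_page_to_ram(new_page);

        /* map the new page writable in place of the old one */
        establish_pte(vma, address, page_table,
                      pte_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot))));

        page_cache_release(old_page);                    /* really frees the old one */
        return 1;
}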
> (and please remove that hesitant "#if 1" and "#if 0" from memory.c).
The #if 1 is for the persistence option, the "swap waste" thing. If you
can afford to waste swap space you can potentially swap out anonymous pages
at no I/O cost, and you also get virtually consecutive addresses that are
more likely to be physically consecutive on disk; I cannot exclude that
somebody with a very huge swap space can afford to keep all their anonymous
pages in the swapcache as well. This is why I haven't dropped it (yet). But
I don't think it makes a huge difference (the #if is just there to make it
easy to switch behaviour and give the other one a spin).
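
Roughly, the toggle selects something like this (only a sketch of the
shape, not the literal memory.c hunk, with a hypothetical wrapper):

/*
 * Sketch only: after a swapin, either leave the page in the swapcache
 * (the slot stays allocated, so a later swapout of the unmodified page
 * needs no I/O and reuses the same on-disk location), or give the swap
 * slot back immediately so no swap space is wasted.
 */
static void swapin_persistence_sketch(struct page *page, swp_entry_t entry)
{
#if 1   /* persistence: waste the swap slot, get free re-swapout */
        swap_free(entry);       /* the swapcache ref keeps the slot allocated */
#else   /* no persistence: reclaim swap space right away */
        swap_free(entry);
        if (PageSwapCache(page))
                delete_from_swap_cache(page);
#endif
}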
> 4. In 2.4.10-pre10, Linus removed the SetPageReferenced(page) from
> __find_page_nolock, and was adamant on lkml that it's inappropriate
> at that level. Later in the day, Linus produced 2.4.10-pre11 from
> your patches, and that SetPageReferenced(page) reappeared: oversight
> or intentional? Linus? more a question for you than Andrea.
Intentional. Actually I moved it away even before Linus did; I even wrote
a patch where readahead isn't marked referenced at all, with perfect
accounting, but from the numbers it seems we don't want to shrink readahead
until we get the cache hit, so I just moved things back while waiting for
new experiments in that area :)
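
The tradeoff is roughly this (an illustrative fragment only, not the real
__find_page_nolock; lookup_sketch() is a hypothetical stand-in for the
hash lookup):

/*
 * Illustrative only: the question is whether a pagecache lookup hit
 * should set PG_referenced.  If readahead pages are never marked,
 * unused readahead gets shrunk quickly; marking on the cache hit
 * (the current behaviour) keeps it around.
 */
static struct page *lookup_sketch(struct address_space *, unsigned long);

static struct page *find_page_sketch(struct address_space *mapping,
                                     unsigned long index)
{
        struct page *page = lookup_sketch(mapping, index);

        if (page)
                SetPageReferenced(page);        /* mark only on an actual hit */
        return page;
}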
> 5. With -pre12 I'm not getting the 0-order allocation failures which
> interfered with my -pre11 testing, but I did spend a while yesterday
> looking into that, and the patch I found successful was to move the
> "int nr_pages = SWAP_CLUSTER_MAX;" in try_to_free_pages from within
> the loop to the outer level: try_to_free_pages had freed 114 pages
> of the zone, but never as many as 32 in any one go round the loop.
I see, in fact it was originally written that way :). But did you also
check that OOM was still handled gracefully after that?
> You'll have your own ideas of what's right and wrong here, and I'd
Such a change isn't bad; you may want to give it a spin again and check
how OOM reacts and how the swapout behaviour reacts. I'm not changing
anything in that area at the moment unless (/until? :) somebody complains
about performance.
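
For reference, the change you describe amounts to hoisting the counter out
of the priority loop so the freed pages accumulate across passes; roughly
this shape (a sketch only, the shrink_caches signature is approximate and
it's not the exact 2.4.10-pre12 code):

/*
 * Sketch of the control-flow change: with nr_pages declared inside the
 * loop, every pass must free a full SWAP_CLUSTER_MAX on its own to
 * succeed; hoisted out, partial progress carries over between passes.
 */
int try_to_free_pages_sketch(zone_t *classzone, unsigned int gfp_mask)
{
        int priority = DEF_PRIORITY;
        int nr_pages = SWAP_CLUSTER_MAX;        /* hoisted out of the loop */

        do {
                nr_pages = shrink_caches(classzone, priority, gfp_mask, nr_pages);
                if (nr_pages <= 0)
                        return 1;               /* enough freed overall */
        } while (--priority);

        return 0;                               /* fall through to the OOM path */
}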
> --- 2.4.10-pre12/mm/page_alloc.c Wed Sep 19 14:08:14 2001
> +++ linux/mm/page_alloc.c Wed Sep 19 16:21:46 2001
> @@ -86,8 +86,7 @@
> BUG();
> if (PageInactive(page))
> BUG();
> - if (PageDirty(page))
> - BUG();
> + page->flags &= ~((1<<PG_referenced) | (1<<PG_dirty));
>
> if (current->flags & PF_FREE_PAGES)
> goto local_freelist;
> --- 2.4.10-pre12/mm/swapfile.c Wed Sep 19 14:08:14 2001
> +++ linux/mm/swapfile.c Wed Sep 19 16:08:08 2001
> @@ -452,6 +452,7 @@
> lock_page(page);
> if (PageSwapCache(page))
> delete_from_swap_cache_nolock(page);
> + SetPageDirty(page);
> UnlockPage(page);
> flush_page_to_ram(page);
>
> @@ -492,7 +493,6 @@
> mmput(start_mm);
> start_mm = new_start_mm;
> }
> - ClearPageDirty(page);
> page_cache_release(page);
I dislike it, but it's fine with me for now. BTW, I was aware I wasn't
really correct in that change; see the first description of the vm patch:

    I probably have a bug in swapoff but let's ignore it for now; just
    try to run swapoff only before shutting down the machine. The fact
    is that the 2.4 VM breaks when physically dirty pages are freed.
    The last owner of the page (usually the VM, except in swapoff) has to
    clear the dirty flag before freeing the page; in swapoff it may
    be a little more complicated (we may need to grab the pagecache_lock
    to ensure nobody starts using the page while we clear it). And swapoff
    is probably racy anyway, as usual (swapoff in 2.2 is racy too). In
    short, I haven't focused on swapoff yet, I've just made a hack so far
    to make it work while shutting down the machine.
:)
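
For completeness, a minimal sketch of that rule (a hypothetical helper,
not actual swapoff code; it just shows the "last owner clears PG_dirty
before the final release" idea, with the pagecache_lock closing the race
against new users):

/*
 * Illustrative only: whoever drops the last reference must make sure
 * the page is not dirty when it reaches the free lists.
 */
static void release_clean_sketch(struct page *page)
{
        lock_page(page);
        if (PageSwapCache(page))
                delete_from_swap_cache_nolock(page);

        spin_lock(&pagecache_lock);
        if (page_count(page) == 1)              /* we are the last owner */
                ClearPageDirty(page);
        spin_unlock(&pagecache_lock);

        UnlockPage(page);
        page_cache_release(page);               /* may really free it now */
}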
Andrea