Re: [patch] swap-speedup-2.4.3-A1, massive swapping speedup

Marcelo Tosatti (marcelo@conectiva.com.br)
Mon, 23 Apr 2001 19:13:15 -0300 (BRT)


On Mon, 23 Apr 2001, Linus Torvalds wrote:

>
> On Mon, 23 Apr 2001, Jonathan Morton wrote:
> > >There seems to be one more reason, take a look at the function
> > >read_swap_cache_async() in swap_state.c, around line 240:
> > >
> > > /*
> > > * Add it to the swap cache and read its contents.
> > > */
> > > lock_page(new_page);
> > > add_to_swap_cache(new_page, entry);
> > > rw_swap_page(READ, new_page, wait);
> > > return new_page;
> > >
> > >Here we add an "empty" page to the swap cache and use the
> > >page lock to protect people from reading this non-up-to-date
> > >page.
> >
> > How about reversing the order of the calls - ie. add the page to the cache
> > only when it's been filled? That would fix the race.
>
> No. The page cache is used as the IO synchronization point, both for
> swapping and for regular IO. You have to add the page in _first_, because
> otherwise you may end up doing multiple IO's from different pages.
>
> The proper fix is to do the equivalent of this on all the lookup paths
> that want a valid page:
>
> if (!PageUptodate(page)) {
> lock_page(page);
> if (PageUptodate(page)) {
> unlock_page(page);
> return 0;
> }
> rw_swap_page(page, 0);
> wait_on_page(page);
> if (!PageUptodate(page))
> return -EIO;
> }
> return 0;
>
> This is the whole point of the "uptodate" flag, and for all I know we may
> already do all of this (it's certainly the normal setup).
>
> Note how we do NOT block on write-backs in the above: the page will be
> up-to-date from the bery beginning (it had _better_ be, it's hard to write
> back a swap page that isn't up-to-date ;).
>
> The above is how all the file paths work.

Please don't forget about swapin readahead.

If the page is not uptodated and we are doing swapin readahead on it, we
want to fail if the page is already locked (which means its already under
IO).

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/