The above is NOT how the page cache works. Or if some part of the page
cache works that way, then it is a BUG. You must NEVER allow multiple
outstanding reads from the same location - that implies that you're doing
something wrong, and the system is doing too much IO.

The way _all_ parts of the page cache should work is:

Create new page:
 - look up page. If found, return it
 - allocate new page.
 - look up page again, in case somebody else added it while we allocated
   it.
 - add the page atomically with the lookup if the lookup failed, otherwise
   just free the page without doing anything.
 - return the looked-up / allocated page.
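
The "create new page" steps can be sketched in userspace C roughly as
follows. This is only an illustration, not kernel code: the names
(toy_page, toy_get_page, cache_lock, NBUCKETS) are made up for the example,
a pthread mutex stands in for whatever lock protects the real hash, and
reference counting and eviction are left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NBUCKETS 64              /* hypothetical hash size */

struct toy_page {
    unsigned long index;         /* offset in the file, in pages */
    int uptodate;                /* set once the contents are valid */
    struct toy_page *next;       /* hash chain */
};

static struct toy_page *buckets[NBUCKETS];
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold cache_lock. */
static struct toy_page *lookup_locked(unsigned long index)
{
    struct toy_page *p;

    for (p = buckets[index % NBUCKETS]; p; p = p->next)
        if (p->index == index)
            return p;
    return NULL;
}

struct toy_page *toy_get_page(unsigned long index)
{
    struct toy_page *page, *new_page;

    /* Look up page. If found, return it. */
    pthread_mutex_lock(&cache_lock);
    page = lookup_locked(index);
    pthread_mutex_unlock(&cache_lock);
    if (page)
        return page;

    /* Allocate with the lock dropped, since allocation may block. */
    new_page = calloc(1, sizeof(*new_page));
    if (!new_page)
        return NULL;
    new_page->index = index;

    /*
     * Look up again and, only if that fails, insert. Both happen under
     * one hold of the lock, so lookup + add is atomic.
     */
    pthread_mutex_lock(&cache_lock);
    page = lookup_locked(index);
    if (!page) {
        new_page->next = buckets[index % NBUCKETS];
        buckets[index % NBUCKETS] = new_page;
        page = new_page;
        new_page = NULL;         /* now owned by the cache */
    }
    pthread_mutex_unlock(&cache_lock);

    /* If somebody else won the race, just drop our copy (free(NULL) is a no-op). */
    free(new_page);

    assert(page->index == index);
    return page;
}
```

Two callers asking for the same index get the same toy_page back, and
nothing is ever inserted twice. Build with -pthread.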

Return up-to-date page:
 - call the above to get a page cache page.
 - if uptodate, return
 - lock_page()
 - if now uptodate (i.e. somebody else filled it and held the lock), unlock
   and return.
 - start the IO
 - wait on IO by waiting on the page (modulo other work that you could do
   in the background).
 - if the page is still not up-to-date after we tried to read it, we got
   an IO error. Return error.
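
The "return up-to-date page" steps, again as a rough userspace sketch under
stated assumptions rather than real kernel code. Everything prefixed toy_
is hypothetical. The page lock is modelled as a flag guarded by a mutex and
condition variable, the simulated read completes synchronously (real IO
would complete later and unlock the page from the completion path), and the
fail_next / reads_started fields exist only so the behaviour can be
observed. The caller passes in a page it already got from the cache.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct toy_page {
    pthread_mutex_t mtx;         /* protects the fields below */
    pthread_cond_t unlocked;     /* signalled when 'locked' clears */
    int locked;                  /* stands in for the page lock bit */
    int uptodate;
    int fail_next;               /* test hook: make the next read fail */
    int reads_started;           /* counts how many IOs were issued */
    char data[TOY_PAGE_SIZE];
};

void toy_page_init(struct toy_page *p)
{
    memset(p, 0, sizeof(*p));
    pthread_mutex_init(&p->mtx, NULL);
    pthread_cond_init(&p->unlocked, NULL);
}

static void toy_lock_page(struct toy_page *p)
{
    pthread_mutex_lock(&p->mtx);
    while (p->locked)
        pthread_cond_wait(&p->unlocked, &p->mtx);
    p->locked = 1;
    pthread_mutex_unlock(&p->mtx);
}

static void toy_unlock_page(struct toy_page *p)
{
    pthread_mutex_lock(&p->mtx);
    p->locked = 0;
    pthread_cond_broadcast(&p->unlocked);
    pthread_mutex_unlock(&p->mtx);
}

/* Sleep until nobody holds the page lock, i.e. until the IO is done. */
static void toy_wait_on_page(struct toy_page *p)
{
    pthread_mutex_lock(&p->mtx);
    while (p->locked)
        pthread_cond_wait(&p->unlocked, &p->mtx);
    pthread_mutex_unlock(&p->mtx);
}

static int toy_page_uptodate(struct toy_page *p)
{
    int ret;

    pthread_mutex_lock(&p->mtx);
    ret = p->uptodate;
    pthread_mutex_unlock(&p->mtx);
    return ret;
}

/* Simulated read. Completion marks the page uptodate (or not) and unlocks it. */
static void toy_start_read(struct toy_page *p)
{
    pthread_mutex_lock(&p->mtx);
    assert(p->locked);           /* caller must hold the page lock */
    p->reads_started++;
    if (p->fail_next) {
        p->fail_next = 0;        /* IO error: leave the page !uptodate */
    } else {
        memset(p->data, 0xAB, sizeof(p->data));
        p->uptodate = 1;
    }
    pthread_mutex_unlock(&p->mtx);
    toy_unlock_page(p);
}

int toy_read_page(struct toy_page *p)
{
    if (toy_page_uptodate(p))
        return 0;

    toy_lock_page(p);
    if (toy_page_uptodate(p)) {  /* somebody else filled it meanwhile */
        toy_unlock_page(p);
        return 0;
    }

    toy_start_read(p);           /* we hold the lock: only one IO in flight */
    toy_wait_on_page(p);

    return toy_page_uptodate(p) ? 0 : -EIO;
}
```

Since the IO is only ever started with the page lock held and the page
known to be not up-to-date, a second reader sleeps in toy_lock_page()
instead of issuing a duplicate read, and a failed read leaves the page
unlocked and not up-to-date so that the next caller simply retries.
Build with -pthread.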

The above is how it is always meant to work. The above works for both new
allocations and for old. It works even if an earlier read had failed (due
to wrong permissions for example - think about NFS page caches where some
people may be unable to actually fill a page, so that you need to re-try
on failure). The above is how the regular read/write paths work (modulo
bugs). And it's also how the swap cache should work.
Linus