It seems there is a potential race introduced by the swap changes: we no
longer increase the swap map count for the entry on swapin readahead. The
comment on top of swap_duplicate() in read_swap_cache_async() says:
	/*
	 * Make sure the swap entry is still in use. It could have gone
	 * while caller waited for BKL, or while allocating page above,
	 * or while allocating page in prior call via swapin_readahead.
	 */
	if (!swap_duplicate(entry))	/* Account for the swap cache */
		goto out_free_page;
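A minimal sketch of the semantics this check relies on, assuming the usual
"swap map count" behaviour (illustration only, not the kernel's
swap_duplicate(); the function and parameter names below are made up):

	/* Fail if the entry was freed in the meantime, otherwise take an
	 * extra reference so the swap cache is accounted for.  In the
	 * kernel this runs under the relevant swap locking. */
	static int swap_duplicate_sketch(unsigned short *swap_map,
					 unsigned long offset)
	{
		if (swap_map[offset] == 0)	/* entry already gone */
			return 0;
		swap_map[offset]++;		/* account for the swap cache */
		return 1;
	}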
The BKL protects this logic against concurrent read_swap_cache_async()
calls, but it does not protect against get_swap_page() in try_to_swap_out().
I do not see what protects us against the following race (on older kernels,
increasing the swap map count in valid_swaphandles() used to be the
protection):
- swapin_readahead() finds a used entry in the swap map (via valid_swaphandles()).
- The user of this entry deletes it from the swap map, so it becomes free. Then:
CPU0                                    CPU1
read_swap_cache_async()                 try_to_swap_out()
  second __find_get_page() fails
                                          get_swap_page() returns the swap
                                          entry which CPU0 is trying to
                                          read from
  swap_duplicate() for the entry
  succeeds: CPU1 just allocated it
  add_to_swap_cache()                     add_to_swap_cache()
Now we have two pages in the hash tables for the "same" data. From this
point on there is no guarantee as to _which_ page will be returned by a
pagecache lookup.
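To make the interleaving concrete, here is a small user-space model of the
window (a sketch with hypothetical names; a pthread mutex stands in for the
real swap locking, and the bad ordering is forced so the outcome is
deterministic). One thread plays CPU0's readahead side, the other CPU1's
swap-out side, and both end up accounting a page for the same slot:

	#include <pthread.h>
	#include <stdio.h>

	/* One swap-map slot, initially owned by exactly one user. */
	static int slot_count = 1;	/* models swap_map[offset]         */
	static int pages_hashed;	/* pages added to the cache for it */
	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

	/* CPU0: readahead already saw the slot in use via
	 * valid_swaphandles(); now it re-checks and inserts. */
	static void *cpu0_readahead(void *arg)
	{
		(void)arg;
		pthread_mutex_lock(&lock);
		if (slot_count > 0) {	/* swap_duplicate(): "still in use" */
			slot_count++;
			pages_hashed++;	/* add_to_swap_cache()              */
		}
		pthread_mutex_unlock(&lock);
		return NULL;
	}

	/* CPU1: the original user frees the slot, then try_to_swap_out()
	 * gets the very same slot back from get_swap_page() and inserts. */
	static void *cpu1_swapout(void *arg)
	{
		(void)arg;
		pthread_mutex_lock(&lock);
		slot_count--;		/* last user frees: count drops to 0  */
		slot_count++;		/* get_swap_page() hands it out again */
		pages_hashed++;		/* add_to_swap_cache()                */
		pthread_mutex_unlock(&lock);
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;

		/* Force the bad interleaving: CPU1 frees and reallocates the
		 * slot before CPU0's re-check, so CPU0 still sees it "in use"
		 * and inserts a second page for the same data. */
		pthread_create(&t1, NULL, cpu1_swapout, NULL);
		pthread_join(t1, NULL);
		pthread_create(&t0, NULL, cpu0_readahead, NULL);
		pthread_join(t0, NULL);

		printf("pages hashed for the same slot: %d\n", pages_hashed);
		return 0;
	}

With this ordering it prints 2: two cache entries for one swap slot, which
is exactly the duplicate-page situation described above.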
Linus, Hugh ?