> > > If you think about who is going to remove the page from the lru you'll
> > > see the problem.
> >
> > Nope, still don't see it. Whoever hits put_page_testzero frees the page,
> > secure in the knowlege that there are no other references to it.
>
> Well yes, but we cannot remove the page from the lru atomatically
> at page_cache_release time if we follow your proposal. If you think we can,
> show me your implementation of page_cache_release and I'll show
> you where the races are (unless you do everything under the lru_lock
> of course).
void page_cache_release(struct page *page)
{
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) {
__lru_cache_del(page);
atomic_dec(&page->count);
}
spin_unlock(&pagemap_lru_lock);
if (put_page_testzero(page))
__free_pages_ok(page, 0);
}
This allows the following benign race, with initial page count = 3:
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) /* false */
spin_unlock(&pagemap_lru_lock);
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) /* false */
spin_unlock(&pagemap_lru_lock);
if (put_page_testzero(page))
__free_pages_ok(page, 0);
if (put_page_testzero(page))
__free_pages_ok(page, 0);
Neither holder of a page reference sees the count at 2, and so the page
is left on the lru with count = 1. This won't happen often and such
pages will be recovered from the cold end of the list in due course.
The important question is: can this code ever remove a page from the lru
erroneously, leaving somebody holding a reference to a non-lru page? In
other words, can the test PageLRU(page) && page_count(page) == 2 return
a false positive? Well, when this test is true we can account for both
both references: the one we own, and the one the lru list owns. Since
we hold the lru lock, the latter won't change. Nobody else has the
right to increment the page count, since they must inherit that right
from somebody who holds a reference, and there are none.
We could also do this:
void page_cache_release(struct page *page)
{
if (page_count(page) == 2) {
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) {
__lru_cache_del(page);
atomic_dec(&page->count);
}
spin_unlock(&pagemap_lru_lock);
}
if (put_page_testzero(page))
__free_pages_ok(page, 0);
}
Which avoids taking the lru lock sometimes in exchange for widening the
hole through which pages can end up with count = 1 on the lru list.
Let's run this through your race detector and see what happens.
-- Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/