the problem is that we want to catch all COW events of the virtual
address, *and* we want to have a correct (physpage,offset) futex hash.
if we do the following:
q.page = NULL;
attach_vcache(&q.vcache, uaddr, current->mm, futex_vcache_callback);
page = pin_page(uaddr - offset);
ret = IS_ERR(page);
if (ret)
goto out;
head = hash_futex(page, offset);
(*)
set_current_state(TASK_INTERRUPTIBLE);
init_waitqueue_head(&q.waiters);
add_wait_queue(&q.waiters, &wait);
queue_me(head, &q, page, offset, -1, NULL, uaddr);
then we have the following race at (*): the futex is not yet hashed, but
it's already in the vcache hash. Nevertheless a parallel COW only
generates a useless callback - the futex is not hashed yet. The effect is
that the above 'page' address is in fact the old COW page [after the race
only mapped into some other MM], and the queue_me() futex hashing uses the
old page.
the only way i can see to do this race-less is what i suggested two mails
ago: when hashing the futex we have to take the pagetable lock and have to
check the actual current page that is mapped.
in fact there's another race as well: what if a parallel thread COWs the
page, faults it in (thus creates a new physical page), which then kswapd
swaps out. Ie. purely checking the pte (and using any potential new page
for the hash) is not enough, we really have to re-try the pin_pages() call
if the pte has changed meanwhile.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/