Re: [patch] 'virtual => physical page mapping cache', vcache-2.5.38-B8

Ingo Molnar (mingo@elte.hu)
Fri, 27 Sep 2002 19:33:34 +0200 (CEST)

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Manfred Spraul: "Re: [patch 3/4] slab reclaim balancing"
Previous message: James Bottomley: "Re: Warning - running *really* short on DMA buffers while doing file"

thinking about it some more, first attaching the vcache does not help.

the problem is that we want to catch all COW events of the virtual
address, *and* we want to have a correct (physpage,offset) futex hash.

if we do the following:

q.page = NULL;
attach_vcache(&q.vcache, uaddr, current->mm, futex_vcache_callback);

page = pin_page(uaddr - offset);
ret = IS_ERR(page);
if (ret)
goto out;
head = hash_futex(page, offset);
(*)
set_current_state(TASK_INTERRUPTIBLE);
init_waitqueue_head(&q.waiters);
add_wait_queue(&q.waiters, &wait);
queue_me(head, &q, page, offset, -1, NULL, uaddr);

then we have the following race at (*): the futex is not yet hashed, but
it's already in the vcache hash. Nevertheless a parallel COW only
generates a useless callback - the futex is not hashed yet. The effect is
that the above 'page' address is in fact the old COW page [after the race
only mapped into some other MM], and the queue_me() futex hashing uses the
old page.

the only way i can see to do this race-less is what i suggested two mails
ago: when hashing the futex we have to take the pagetable lock and have to
check the actual current page that is mapped.

in fact there's another race as well: what if a parallel thread COWs the
page, faults it in (thus creates a new physical page), which then kswapd
swaps out. Ie. purely checking the pte (and using any potential new page
for the hash) is not enough, we really have to re-try the pin_pages() call
if the pte has changed meanwhile.

Ingo

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Manfred Spraul: "Re: [patch 3/4] slab reclaim balancing"
Previous message: James Bottomley: "Re: Warning - running *really* short on DMA buffers while doing file"