Agreed.
> The mmap semaphore is a read-write semaphore, and it _is_ permissible to
> call "copy_to_user()" and friends while holding the read lock.
>
> The bug appears to be in the implementation of the write semaphore -
> down_write() doesn't understand that blocked writes must not block new
> readers, exactly because of this situation.
Exactly, it is the same reason we need the same property from the rw
spinlocks (to be allowed to read_lock without disabling irqs). Thanks so
much for reminding me about this! Unfortunately my rwsemaphores block
new readers at the first down_write (for the sake of better fairness;
I obviously forgot that doing so would introduce such a deadlock).
The fix is only a few lines for my implementation, here it is:
--- 2.4.10pre10aa2/lib/rwsem_spinlock.c.~1~ Mon Sep 17 19:17:24 2001
+++ 2.4.10pre10aa2/lib/rwsem_spinlock.c Tue Sep 18 01:59:06 2001
@@ -73,11 +73,13 @@
 void down_read(struct rw_semaphore *sem)
 {
+	int count;
 	CHECK_MAGIC(sem->__magic);
 	spin_lock(&sem->lock);
+	count = sem->count;
 	sem->count += RWSEM_READ_BIAS;
-	if (__builtin_expect(sem->count, 0) < 0)
+	if (__builtin_expect(count < 0 && !(count & RWSEM_READ_MASK), 0))
 		rwsem_down_failed(sem, RWSEM_READ_BLOCKING_BIAS);
 	spin_unlock(&sem->lock);
 }
It will be applied to the next -aa. For the mainline semaphores I assume
David will take care of that.
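As a rough illustration of what the changed test in down_read() buys,
here is a user-space sketch of the two admission checks. The TOY_*
constants and both function names are made up for this example and are
not the real rwsem_spinlock.c encoding: the sketch assumes the low 16
bits of the count hold the number of active readers and that a writer,
whether holding or queued, subtracts TOY_WRITE_BIAS so the count goes
negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding for illustration only, NOT the -aa constants. */
#define TOY_READ_BIAS   1        /* added by each reader */
#define TOY_READ_MASK   0xffffu  /* low bits: number of active readers */
#define TOY_WRITE_BIAS  0x10000  /* subtracted by a holding or queued writer */

/* Old test: block whenever the count is negative after adding our bias,
 * i.e. whenever any writer is holding or waiting. */
static bool old_reader_blocks(int32_t count_before)
{
	int32_t count_after = count_before + TOY_READ_BIAS;

	return count_after < 0;
}

/* New test: look at the count sampled before adding our bias, and block
 * only if a writer is involved AND no reader currently holds the lock. */
static bool new_reader_blocks(int32_t count_before)
{
	return count_before < 0 &&
	       !((uint32_t)count_before & TOY_READ_MASK);
}
```

With one reader holding the lock and a writer queued behind it, the toy
count is 1 - TOY_WRITE_BIAS. The old check blocks a second reader in
that state (which is the deadlock when that reader is the same task
faulting inside copy_to_user()), while the new check lets it through.
Both checks still block a reader while a writer holds the lock with no
readers present.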
For the record, I'm using spinlock based rwsemaphores. Last time I
checked my asm semaphores I found a small race in up_write. I didn't
check whether the mainline semaphores were affected too; I just
preferred to stay safe with the spinlock in the meantime. In the
microbenchmark the spinlock based rwsems weren't that much slower
(and my optimized version is much faster than the mainline spinlock
based rwsem), so using asm is not a noticeable improvement in the macro
real life benchmarks, and the robustness of the spinlock is quite
invaluable, even more so now that it allowed me to do a bugfix without
panicking while making those changes. I think I will return to the asm
rwsem only after proving my implementation correct with math, or after
writing an automated simulation that checks its correctness in all
possible race combinations (assuming they're mutexes and with a
variable number of threads).
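To give an idea of what such a simulation could look like, here is a
deliberately tiny sketch. It is not the checker described above and
does not model the rwsem at all: it takes a toy lock used as a mutex,
enumerates every interleaving of NTHREADS threads step by step, and
reports whether two threads can ever be in the critical section at
once. All names (struct state, explore, check_mutual_exclusion) are
hypothetical, and deadlock detection is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NTHREADS 2
#define DONE     5	/* program counter value of a finished thread */

struct state {
	int pc[NTHREADS];	/* per-thread program counter */
	int locked;		/* toy lock word */
	int in_cs;		/* threads currently in the critical section */
};

/* Depth-first walk over every possible interleaving. Returns false as
 * soon as some schedule puts two threads in the critical section. */
static bool explore(struct state s, bool atomic_lock)
{
	for (int t = 0; t < NTHREADS; t++) {
		struct state n = s;

		switch (s.pc[t]) {
		case 0:	/* acquire: wait until the lock looks free */
			if (s.locked)
				continue;	/* thread t cannot run now */
			if (atomic_lock) {
				n.locked = 1;	/* test and set in one step */
				n.pc[t] = 2;
			} else {
				n.pc[t] = 1;	/* racy: set in a later step */
			}
			break;
		case 1:	/* second half of the racy acquire */
			n.locked = 1;
			n.pc[t] = 2;
			break;
		case 2:	/* enter critical section */
			n.in_cs++;
			n.pc[t] = 3;
			break;
		case 3:	/* leave critical section */
			n.in_cs--;
			n.pc[t] = 4;
			break;
		case 4:	/* release */
			n.locked = 0;
			n.pc[t] = DONE;
			break;
		default:	/* thread t has finished */
			continue;
		}
		if (n.in_cs > 1)
			return false;	/* mutual exclusion violated */
		if (!explore(n, atomic_lock))
			return false;
	}
	return true;
}

/* True if no interleaving violates mutual exclusion. */
static bool check_mutual_exclusion(bool atomic_lock)
{
	struct state s = { {0}, 0, 0 };

	return explore(s, atomic_lock);
}
```

check_mutual_exclusion(true) explores the lock with an atomic
test-and-set and finds no violation; check_mutual_exclusion(false)
splits the test and the set into two steps and the search finds the
schedule where both threads see the lock free. Every step moves a
program counter forward, so the search always terminates.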
Andrea