> I'm saying that we map in a page at a magic offset (just above the stack),
> and that page contains the locking code.
>
> For 386 CPU's (where only UP matters), we can trivially come up with a
> lock that doesn't use cmpxchg8b and that isn't SMP-safe. It might even go
> into the kernel every time if it has to - ie it _works_, it just isn't
> optimal.
>
Ahhhh, in this context, now "I see the light" (actually it's dark here on
the East Coast).
So do you envision this going through some named "section", or do you
want to go through the futex_region() library call, which checks whether
the code page has been mapped? If not, the kernel would then provide the
locking code in that page, depending on the architecture (UP or SMP).
Fair enough.
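To check my understanding, here is a rough user-space sketch of that flow. It only simulates the idea: the "page" is a struct holding a function pointer, and every name in it (lock_page, lock_region, lock_up, lock_smp) is made up for illustration, not a real kernel or libc interface. The SMP variant uses the GCC/Clang __atomic_sub_fetch builtin as a stand-in for a "lock"-prefixed instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: signature of the locking routine living in the magic page. */
typedef int (*lock_fn)(int *word);

/* UP variant: plain decrement, stands in for an instruction without "lock". */
static int lock_up(int *word)
{
    return --*word;
}

/* SMP variant: atomic decrement via a GCC/Clang builtin. */
static int lock_smp(int *word)
{
    return __atomic_sub_fetch(word, 1, __ATOMIC_ACQ_REL);
}

/* Stand-in for the mapped code page; acquire == NULL means "not mapped". */
struct lock_page {
    lock_fn acquire;
};

/* Hypothetical region call: fill the page on first use, then reuse it. */
static const struct lock_page *lock_region(struct lock_page *page, int ncpus)
{
    assert(page != NULL);
    if (page->acquire == NULL)
        page->acquire = (ncpus > 1) ? lock_smp : lock_up;
    return page;
}
```

A caller would fetch the page once and then call page->acquire(&word) without caring which variant it got.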
> > Fail to see why that matters. User level locking is mostly beneficial on
> > SMPs.
>
> That's not the issue AT ALL.
>
> Semaphores are absolutely required on UP too, with threads. There is
> _zero_ difference between UP and SMP from a locking perspective in user
> space due to the fact that we can be preempted at any time - except from
> the cache coherency issue.
>
Agreed, my point was only about providing the functionality. The only
difference between UP and SMP would be that a spinning version would default
to the standard version (no spinning) under UP.
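What I have in mind looks roughly like the sketch below, written with C11 atomics. All names (ulock, spin_budget, ulock_acquire_fast) and the retry count of 100 are assumptions for illustration. A false return marks the point where a real implementation would block in the kernel, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-level lock word: 0 = free, 1 = taken. */
struct ulock {
    atomic_int held;
};

/* Retries allowed before giving up. On UP spinning is pointless, since
 * the holder cannot run while we spin, so the budget is zero. */
static int spin_budget(int ncpus)
{
    return ncpus > 1 ? 100 : 0;
}

/* One attempt to take the lock with a compare-and-swap. */
static bool ulock_try(struct ulock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->held, &expected, 1);
}

/* One attempt plus up to spin_budget(ncpus) retries.
 * Returns false where the caller would have to block in the kernel. */
static bool ulock_acquire_fast(struct ulock *l, int ncpus)
{
    int retries = spin_budget(ncpus);

    assert(l != NULL);
    do {
        if (ulock_try(l))
            return true;
    } while (retries-- > 0);
    return false;
}

static void ulock_release(struct ulock *l)
{
    atomic_store(&l->held, 0);
}
```

With ncpus == 1 the spinning version makes exactly one attempt, i.e. it behaves like the standard version.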
> > So, you lock the bus for the atomic update. This is UP, nothing's going
> > on on the bus anyway.
>
> That's not the point. Nobody has locked the bus in the last ten years: the
> cache coherency is done on a cacheline basis, not on the bus.
>
> The point being that the difference between a "decl" and a "lock ; decl"
> is about 1:12 or so in performance.
>
I am no expert in architecture, but if it's done through the cache coherency
mechanism, the overhead shouldn't be 12:1. You simply mark the cache line as
part of your instruction to avoid a cache line transfer. How can that be 12
times slower? Ready to be educated....
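For reference, the two variants being compared look roughly like this in C. This is only a sketch: the function names are made up, C does not promise that the plain version compiles to a single "decl" (the real thing would be hand-written assembly), and on x86 a compiler would typically emit a "lock"-prefixed instruction for the atomic one. It shows what differs, not how much each costs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Plain decrement: the analogue of "decl". Not safe against other CPUs. */
static int dec_plain(int *count)
{
    assert(count != NULL);
    return --*count;
}

/* Atomic decrement: the analogue of "lock ; decl". Returns the new value. */
static int dec_locked(atomic_int *count)
{
    assert(count != NULL);
    return atomic_fetch_sub(count, 1) - 1;
}
```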
> > Even if its a few more cycles, still beats the heck out of using other
> > heavyweight kernel APIs
>
> Sure it does. But if the speed of locking matters enough for user-level
> locks to matter, don't you think the 1:12 difference matters as well?
>
> Linus
Yep, I buy that argument.
Overall, it just seems to me that the user locking subsystem is quickly
becoming a complicated beast again.
Anyway, time to go home and play with the kids :-)
-- Hubertus Franke (frankeh@watson.ibm.com)