It is a compiler mistake... the compiler clobbers another register for
you. The compiler does not, however, know about timing issues with the
contents of the inline assembly... otherwise it'd stick a delay in front of
the XADD in my stuff.
> I tried with a variable pointer and gcc, as I expected, generated the (%%eax)
> form. But when the address is a constant, as in the bench, my way avoids
> stalling the pipeline by using the constant address for the locked incl,
> exactly as you said, and that's probably why I beat you on the down read
> fast path too.
> (I also benchmarked with a variable semaphore and it was running a little
> slower)
*grin* Fun ain't it... Try it on a dual Athlon or P4 and the answer may come
out differently.
David