I see what you meant here, and no, I'm not lucky: I thought about that. gcc
2.95.* seems smart enough to produce the (%%eax) that you hardcoded when the
sem is not a constant (I'm not clobbering another register; if gcc used one
anyway that would be stupid and I'd consider it a compiler mistake). I tried
with a variable pointer and gcc, as I expected, generated the (%%eax). When
the sem is a constant, as in the bench, my way avoids stalling the pipeline
by using the constant address directly for the locked incl, exactly as you
said, and that's probably why I beat you on the down read fast path too. (I
also benchmarked with a variable semaphore and it ran a little slower.)
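
To make the difference concrete, here is a minimal sketch of the two styles.
It is not the code from the bench: struct demo_sem, global_sem,
inc_mem_operand(), inc_hardcoded_reg() and demo_run() are made-up names for
illustration, and the non-x86 fallback is there only so the file builds
elsewhere. With an "m" operand gcc is free to pick the addressing mode, so a
semaphore at a constant address can be addressed directly, while a pointer
variable ends up in a register. The hardcoded form always needs the address
loaded into the register first.

```c
#include <assert.h>

/* Hypothetical stand-in for the semaphore, not the kernel's structure. */
struct demo_sem {
	int count;	/* kept at offset 0 so (%eax) addresses it directly */
};

/* A semaphore whose address is a link-time constant, as in the bench. */
static struct demo_sem global_sem;

/* "m" operand: the compiler chooses how to address sem->count. */
static inline void inc_mem_operand(struct demo_sem *sem)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl %0"
			     : "+m" (sem->count)
			     :
			     : "memory", "cc");
#else
	/* Assumption: portable fallback so the sketch builds off x86. */
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Hardcoded register: the pointer is always forced into eax/rax first. */
static inline void inc_hardcoded_reg(struct demo_sem *sem)
{
#if defined(__i386__)
	__asm__ __volatile__("lock; incl (%%eax)"
			     :
			     : "a" (sem)
			     : "memory", "cc");
#elif defined(__x86_64__)
	__asm__ __volatile__("lock; incl (%%rax)"
			     :
			     : "a" (sem)
			     : "memory", "cc");
#else
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Increment the constant-address semaphore once with each variant. */
static int demo_run(void)
{
	global_sem.count = 0;
	inc_mem_operand(&global_sem);
	inc_hardcoded_reg(&global_sem);
	assert(global_sem.count == 2);
	return global_sem.count;
}
```

Compiling this with gcc -O2 -S and comparing the two incl instructions after
inlining should show which addressing mode each variant got; the exact output
depends on the gcc version and target.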
Andrea