Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

David Howells (dhowells@warthog.cambridge.redhat.com)
Tue, 24 Apr 2001 11:33:13 +0100


> I see what you meant here and no, I'm not lucky, I thought about that. gcc
> 2.95.* seems smart enough to produce (%%eax) that you hardcoded when the
> sem is not a constant (I'm not clobbering another register, if it does it's
> stupid and I consider this a compiler mistake).

It is a compiler mistake... the compiler clobbers another register for
you. The compiler does not, however, know about timing issues with the
contents of the inline assembly... otherwise it'd stick a delay in front of
the XADD in my stuff.

> I tried with a variable pointer and gcc as I expected generated the (%%eax)
> but instead when it's a constant like in the bench my way it avoids to stall
> the pipeline by using the constant address for the locked incl, exactly as
> you said and that's probably why I beat you on the down read fast path too.
> (I also benchmarked with a variable semaphore and it was running a little
> slower)

*grin* Fun ain't it... Try it on a dual athlon or P4 and the answer may come
out differently.
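
For anyone following along, here is a minimal sketch of the two operand
styles being compared above. It is not the code from either patch: the names
demo_sem, demo_inc_mem, demo_inc_reg and demo_run are made up for
illustration, and the non-x86 fallback is only there so the sketch compiles
elsewhere. With an "m" constraint gcc is free to choose the addressing mode,
so a semaphore at a fixed address can be addressed directly, while a pointer
argument ends up as a register-indirect operand such as (%eax). With an "r"
constraint the address always goes through a register. Which one is faster
is exactly the part that may vary between CPUs.

```c
/* Hypothetical stand-in for the semaphore counter, not the kernel's
 * struct rw_semaphore. */
struct demo_sem {
	int count;
};

/* A global, so its address is known at link time. */
static struct demo_sem global_sem;

/* Memory-operand form: "+m" leaves the addressing mode to gcc.  For a
 * global it can encode the address directly (absolute or RIP-relative);
 * for a pointer argument it falls back to register-indirect. */
static inline void demo_inc_mem(struct demo_sem *sem)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl %0"
			     : "+m" (sem->count)
			     :
			     : "cc");
#else
	/* Assumed fallback for other architectures, illustration only. */
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Register form: the address is loaded into a register first, so the
 * instruction is always register-indirect, e.g. incl (%eax). */
static inline void demo_inc_reg(struct demo_sem *sem)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl (%0)"
			     :
			     : "r" (&sem->count)
			     : "memory", "cc");
#else
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Increment once with each form and return the resulting count. */
static int demo_run(void)
{
	global_sem.count = 0;
	demo_inc_mem(&global_sem);
	demo_inc_reg(&global_sem);
	return global_sem.count;
}
```

Compiling this with gcc -O2 -S and looking at the code generated for
demo_run() should show the difference in operands; timing the two forms on
different machines is left to the reader.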

David