Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

Andrea Arcangeli (andrea@suse.de)
Tue, 24 Apr 2001 12:17:47 +0200


On Tue, Apr 24, 2001 at 09:56:11AM +0100, David Howells wrote:
> | + : "+m" (sem->count), "+a" (sem)
        ^^^^^^^^^^ I think you were commenting on
                   the "+m", not the "+a", OK
>
> From what I've been told, you're lucky here... you avoid a pipeline stall

I see what you meant here and no, I'm not lucky, I thought about that. gcc
2.95.* seems smart enough to produce the same (%%eax) that you hardcoded when
the sem is not a constant (I'm not clobbering another register; if gcc did use
one it would be stupid and I'd consider that a compiler mistake). I tried with
a variable pointer and, as I expected, gcc generated the (%%eax). When the sem
is a constant instead, as in the bench, my way avoids stalling the pipeline by
using the constant address for the locked incl, exactly as you said, and that's
probably why I beat you on the down read fast path too. (I also benchmarked
with a variable semaphore and it ran a little slower.)
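
To make the difference concrete, here is a minimal sketch of the two ways of
writing the locked incl. It is not the code from the patch: struct demo_sem,
global_sem and both function names are hypothetical, and the non-x86 fallback
is only there so the sketch builds elsewhere. With the "m" constraint gcc is
free to pick the addressing mode, so for a semaphore at a fixed address it can
encode the address directly in the instruction; with the pointer pinned to
eax via "a" it always has to load the register first.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct rw_semaphore;
 * count sits at offset 0 so (%reg) addresses it directly. */
struct demo_sem {
	int count;
};

/* A semaphore at a link-time constant address, as in the benchmark. */
static struct demo_sem global_sem;

/* "+m" form: gcc chooses how to address sem->count. */
static inline void inc_mem_constraint(struct demo_sem *sem)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl %0"
			     : "+m" (sem->count)
			     :
			     : "cc");
#else
	/* Assumption: portable fallback for non-x86 builds. */
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Fixed-register form: the pointer is forced into eax (rax on x86-64)
 * and the instruction always dereferences that register. */
static inline void inc_fixed_register(struct demo_sem *sem)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; incl (%0)"
			     :
			     : "a" (sem)
			     : "memory", "cc");
#else
	/* Assumption: portable fallback for non-x86 builds. */
	__atomic_fetch_add(&sem->count, 1, __ATOMIC_SEQ_CST);
#endif
}
```

Calling inc_mem_constraint(&global_sem) and comparing the gcc -O2 -S output
against inc_fixed_register(&global_sem) should show whether your compiler
takes advantage of the constant address; the exact addressing mode depends on
the gcc version and the target.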

Andrea