Re: Measuring execution time

Jakob Østergaard (jakob@unthought.net)
Wed, 16 Jan 2002 22:47:19 +0100

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Richard B. Johnson: "Re: floating point exception"
Next message: David Engebretsen: "[PATCH] pcnet32"
Previous message: Mark Zealey: "Re: floating point exception"

On Wed, Jan 16, 2002 at 12:23:33PM -0500, Chris Friesen wrote:
> Mark Cuss wrote:
>
> > I am working on optimizing some software and would like to be able to
> > measure how long an instruction takes (down to the clock cycle of the CPU).
> > I recall reading somewhere about a kernel time measurement called a "Jiffy"
> > and figured that it would probably apply to this.
> >
> > If anyone has any tips on how to figure out how to do this I'd really
> > appreciate it.
>
> Jiffies are quite coarse-grained. On x86 you want the rdtsc instruction, while
> on ppc you want mfrtcu/mfrtcl or mftbu/mftb depending on the version of the
> chip. These are used as inline assembly, and if you do a google search you
> should be able to find code snippets.

However, rdtsc will completely change how your decoders fill, how busy your
execution units are, it will interfere with every singe stage in the CPU
from the fetch logic to the retirement and write buffer.

Your cycle will not behave "as usual" if you put rdtscs around it.

I usually put an rdtsc in the very beginning of a routine, and an rdtsc
at the end. Then I have the assembly following the last rdtsc increment
a counter in an array (after a bounds check). The index in the array
is the number of clock cycles I spent.

After running the function some thousands of times, you will have a nice
histogram in your array, usually with some skewed bell-like curve (you can
see many interesting kinds of double/triple bells etc. all depending on
your code).

Changing just one or two instructions in the function, then re-running the
code and re-plotting the histogram, will displace the peak of the bell curve
in some direction. This will tell you how your change affected the code
in "typical number of clock cycles".

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Richard B. Johnson: "Re: floating point exception"
Next message: David Engebretsen: "[PATCH] pcnet32"
Previous message: Mark Zealey: "Re: floating point exception"