Re: Locking comment on shrink_caches()

Linus Torvalds (torvalds@transmeta.com)
Wed, 26 Sep 2001 19:12:54 +0000 (UTC)


In article <Pine.LNX.4.30.0109262036480.8655-100000@Appserv.suse.de>,
Dave Jones <davej@suse.de> wrote:
>On Wed, 26 Sep 2001, Linus Torvalds wrote:
>
>> > > cpuid: 72 cycles
>> > cpuid: 79 cycles
>> > Only slightly worse, but I'd not expected this.
>> That difference can easily be explained by the compiler and options.
>
>Actually repeated runs of the test on that box show it deviating by up
>to 10 cycles, making it match the results that Alan posted.
>-O2 made no difference, these deviations still occur. They seem more
>prominent on the C3 than other boxes I've tried, even with the same
>compiler toolchain.

Does the C3 do any kind of frequency shifting?

For example, on a transmeta CPU, the TSC will run at a constant
"nominal" speed (the highest the CPU can go), although the real CPU
speed will depend on the load of the machine and temperature etc. So on
a crusoe CPU you'll see varying speeds (and it depends on the speed
grade, because that in turn depends on how many longrun steps are being
actively used).

For example, on a mostly idle machine I get

torvalds@kiwi:~ > ./a.out
nothing: 54 cycles
locked add: 54 cycles
cpuid: 91 cycles

while if I have another window that does an endless loop to keep the CPU
busy, the _real_ frequency of the CPU scales up, and the machine
basically becomes faster:

torvalds@kiwi:~ > ./a.out
nothing: 36 cycles
locked add: 36 cycles
cpuid: 54 cycles

(The reason why the "nothing" TSC read is expensive on crusoe is because
of the scaling of the TSC - rdtsc literally has to do a floating point
multiply-add to scale the clock to the right "nominal" frequency. Of
course, "expensive" is still a lot less than the inexplicable 80 cycles
on a P4).

(That's a 600MHz part going down to to 400MHz in idle, btw)

On a 633MHz part (I don't actually have access to any of the high speed
grades ;) it ends up being

fast:
nothing: 39 cycles
locked add: 40 cycles
cpuid: 68 cycles

slow:
nothing: 82 cycles
locked add: 84 cycles
cpuid: 122 cycles

which corresponds to a 633MHz part going down to 300MHz in idle.

And of course, you can get pretty much anything in between, depending on
what the load is...

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/