The dummy emulator costs exactly 296 cycles (stable) on my
K6-2/450. It only adds 3 to eip, then returns.
To check this, I compared 1 million iterations of 10
consecutive cmove %eax,%eax with as many lea 0(%eax),%eax
(1 cycle, RAW dependency, not parallelizable), and the
difference was exactly 660 ns/inst (297 cycles).
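For reference, the ns-to-cycles conversion used above can be sketched as
follows. This is only an illustration: ns_to_cycles() is a made-up helper
name, and the clock is assumed to be a flat 450 MHz.

```c
/*
 * Convert a per-instruction time in nanoseconds to CPU cycles.
 * At f MHz the CPU runs f cycles per 1000 ns, so
 * cycles = ns * MHz / 1000.
 * ns_to_cycles() is a hypothetical helper, not part of any kernel API.
 */
static unsigned long ns_to_cycles(unsigned long ns, unsigned long mhz)
{
    return (ns * mhz + 500) / 1000;  /* round to the nearest cycle */
}
```

With the numbers above, ns_to_cycles(660, 450) gives 297.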
That said, I agree with you that it's worth optimizing a
bit, at least to stay closer to 300 cycles than to 450.
But that won't make emulated machines fast anyway.
One interesting note: I tested the prog on a VIA C3/533
MHz. One native cmove %eax,%eax costs 56 cycles here! (At
first, I even thought it was emulated.) It's a shame to see
how these instructions have been implemented. Maybe they
flush the pipelines, write-backs, ... before the instruction.
BTW, cmov isn't reported in cpu_flags, perhaps to discourage
progs from using it ;-)
I will recode the stuff, and add two preventive messages:
- at boot time: "warning: this kernel may emulate unsupported
  instructions. If you find it slow, please do dmesg."
- at first emulation: "trap caught for instruction XXX, program XXX."
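The "first emulation" message could be guarded by a simple one-shot flag.
The sketch below is user-space C for illustration only: emu_warned and
report_first_emulation() are made-up names, and a kernel version would
presumably go through printk() instead of filling a buffer.

```c
#include <stdio.h>

/* Hypothetical one-shot flag: set once the first trap has been reported. */
static int emu_warned;

/*
 * Format the first-emulation message into buf.
 * Returns 1 if a message was produced (first call only), 0 afterwards,
 * so the log is not flooded by every emulated instruction.
 */
static int report_first_emulation(char *buf, size_t len,
                                  const char *insn, const char *prog)
{
    if (emu_warned)
        return 0;
    emu_warned = 1;
    snprintf(buf, len, "trap caught for instruction %s, program %s.",
             insn, prog);
    return 1;
}
```

Only the first call fills the buffer; later calls return 0 and leave it
untouched.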
Cheers,
Willy