.. and it probably serializes the instruction stream.
Look at the patch.
The _only_ thing it does for the system call path is:
 - remove the "cli"
 - change
        cmpl $0,..
        jne
        cmp $0,..
        jne
   into
        movl ..,reg
        testl reg,reg
        jne
and the latter may be worth a cycle (or two, if the CPU happens to like
the second form better for some other reason), but it's certainly not
noticeable.
A 3.4% improvement is equivalent to something like 9 cycles, so the "cli"
being faster on Athlon than on a PIII certainly explains why it's less
noticeable on the Athlon, but it still makes me suspect that the _real_
cost of the cli is on the order of 8 cycles.
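
The arithmetic behind those numbers can be sketched in a few lines of C. This is only an illustration: the function names are made up for this example, and the one-cycle share given to the compare/test rewrite is the rough guess from the text above, not a measurement.

```c
/* Size of the measured path, in cycles, implied by a saving of
 * `saved_cycles` showing up as a `percent` improvement. */
static double implied_baseline_cycles(double saved_cycles, double percent)
{
    return saved_cycles / (percent / 100.0);
}

/* What is left for the cli once the (assumed) share of the
 * cmpl -> movl/testl rewrite is subtracted from the total saving. */
static double cli_share(double total_saved, double rewrite_saved)
{
    return total_saved - rewrite_saved;
}
```

With the figures quoted here, implied_baseline_cycles(9.0, 3.4) comes out at roughly 265 cycles for the whole measured path, and cli_share(9.0, 1.0) leaves 8 cycles for the cli.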
It should be eminently testable. Just remove the cli from the standard
kernel, and do before-and-after tests.
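
One possible user-space side of such a before-and-after test is sketched below. It is an assumption about how one might measure this, not the benchmark that produced the 3.4% figure: it times a trivial system call (getppid, chosen here only because it does almost no work in the kernel) and reports the average cost in nanoseconds. The function name ns_per_syscall is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per getppid() system call,
 * taken over `iterations` back-to-back calls. */
static double ns_per_syscall(long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid);   /* raw syscall, bypasses any libc caching */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9
                      + (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Run the same binary on the kernel with the cli and on the one without it, with a large iteration count, and compare the two averages. Converting the difference to cycles needs the CPU clock frequency, and a difference of a few cycles is small enough that repeated runs on an otherwise idle machine are needed before believing it.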
Linus