I didn't benchmark it, but as a data point, ICC 7 now generates movls instead
of pushes too (even though that produces bigger code). In fact it is even more
aggressive about it than gcc: gcc does it only for more than three or four registers,
while icc does it for two or more. So I expect it to be faster on Intel CPUs too, at
least on the P4. I doubt they tuned it for Athlons.
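
To make the two styles concrete, here is a minimal sketch for a two-argument
call on 32-bit x86. The function names add2 and call_add2 are made up for
illustration, and the assembly in the comments is hand-written to show the
general shape of each sequence; it is not captured output from gcc or ICC.

```c
/* Hypothetical callee taking two stack-passed arguments (32-bit x86 cdecl). */
int add2(int a, int b)
{
    return a + b;
}

/*
 * Two ways a compiler might set up the call below. Both sequences are
 * illustrative and hand-written, not real compiler output.
 *
 * Style 1, pushes (smaller code):
 *     pushl  $2
 *     pushl  $1
 *     call   add2
 *     addl   $8, %esp
 *
 * Style 2, movls into already reserved stack space (bigger code):
 *     subl   $8, %esp
 *     movl   $2, 4(%esp)
 *     movl   $1, (%esp)
 *     call   add2
 *     addl   $8, %esp
 */
int call_add2(void)
{
    return add2(1, 2);
}
```

Compiling a file like this with -S for a 32-bit target and reading the
generated assembly should show which style a given compiler version and
optimization level actually picks.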
-Andi