Specifically, the 30% comes in two places. Using Intel proprietary
benchmarks (unreleased, according to the footnotes) they find that a
typical IA32 instruction mix uses some 35% of system resources in an
advanced device like the P4 with NetBurst. the rest is idle.
by using the SMT model with two virtual systems - each with complete
register sets and independent APICs, sharing only the backend exec
units - they claim you get a 30% improvement in wall-clock time. This
is supposed to be on their benchmarks *without* recompiling anything. To
get "additional" improvement, using code to take advantage of the dual
virtual CPUs nature of the chip and recompiling should give some
unquantified gain.
-josh
to help your searching if you want more details, Intel has called this:
Jackson Technology aka Project Foster aka Hyper-Threading Technology
and is known in the rest of the world as SMT.
Intel has a whitepaper or two available for download. If you can't find
them at developer.intel.com or via Google, let me know and I've got some
copies laying around. Amusingly, they seem to be ultra scared of
releasing any real information about it. Alpha was working on a 4-way
design that seemed a bit more clever for the 21464, which appears to be
destined for the bit bucket now :(
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/