I see the light! Thank you.
> This is my render code, implemented with POSIX threads, running on a dual
> P4-Xeon@1.8GHz.
> Number of threads   Elapsed time   User time   System time
>         1              53:216        53:220       00:000
>         2              29:272        58:180       00:320
>         3              27:162      1:21:450       00:540
>         4              25:094      1:41:080       01:250
>
> Elapsed is measured by the parent thread, which does nothing but wait
> on a pthread_join. User and system times are the sums of the times
> for all the child threads, which do the real work.
>
> The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
> I have my cpus doubled but each one has half the pipelining for floating
> point...see the user cpu time increased due to 'worse' processors and
> cache pollution on each package.
How much HT helps depends heavily on the type of code being run. In other
words, HT helps, but it is
*no* substitute for true multiple processors. And it is ONLY of value when
an SMP kernel is in use.
What you're seeing meshes with my results: our performance gains from HT are
about the same. HT didn't lose either of us anything, but it sure as heck
didn't make the kind of difference the hype seems to imply.
As for REAL SMP: I posted some more numbers on my web site (URL below),
using the same gcc compile test on my dual-proc with PIII-600s. Using a
single process, the compile took just under 100 minutes, while with two
processes, it finished in 58.5 minutes. Real SMP reduced the time by about 40%
(again, similar to your numbers).
..Scott
--
Scott Robert Ladd
Coyote Gulch Productions, http://www.coyotegulch.com
No ads -- just very free (and somewhat unusual) code.