Re: [BENCHMARK] Corrected gcc3.2 v gcc2.95.3 contest results

Andrew Morton (akpm@digeo.com)
Mon, 23 Sep 2002 20:01:20 -0700


Con Kolivas wrote:
>
> ...
> n=5 for number of samples
>
> Kernel Mean CI(95%)
> 2.5.38 411 344-477
> 2.5.39-gcc32 371 224-519
> 2.5.38-mm2 95 84-105
>
> The mean is a simple average of the results, and the CI(95%) are the 95%
> confidence intervals the mean lies between those numbers. These numbers seem to
> be the most useful for comparison.
>
> Comparing 2.5.38(gcc2.95.3) with 2.5.38(gcc3.2) there is NO significant
> difference (p 0.56)
>
> Comparing 2.5.38 with 2.5.38-mm2 there is a significant diffence (p<0.001)
> [SNIP...]
>
> when I've run dozens of tests previously on the same kernel I've found that even
> with a mean of 400 rarely a value of 80 will come up. Clearly this lowest score
> does not give us the information we need.
>

I think this is really going way too far. I mean, the datum which
we take away from the above result is that 2.5.38 sucks. No more
accuracy is required.

Yes, if the differences are small then a few extra runs may be needed
to drill down into the finer margins. The tester should be able to
judge that during the test. You get a feel for these things.

I believe that your time would be better spent developing and incorporating
more tests (wider coverage) than worrying about super-high accuracy.

(And if there's more than a 1% variation between same kernel, compiled
with different compilers then the test is bust. Kernel CPU time is
dominated by cache misses and runtime is dominated by IO wait.
Quality of code generation is of tiny significance)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/