Five samples is about the minimum at which you can compute confidence
intervals and trust them to a reasonable extent. We can see that this is
enough samples for a comparison against -mm2, but we can't say anything
about the gcc32 compiled version.
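
To make the confidence-interval argument concrete, here is a minimal sketch
of how such intervals could be computed for a handful of runs. It assumes
the run times are roughly normally distributed and uses the standard
two-sided 95% Student-t critical values. The function names and the sample
numbers are hypothetical and are not measurements from this thread.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def confidence_interval(samples):
    """Return the (low, high) 95% confidence interval for the mean."""
    n = len(samples)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch only covers 2 to 11 samples")
    mean = statistics.mean(samples)
    # Half-width = t * s / sqrt(n), with s the sample standard deviation.
    half = T_CRIT_95[df] * statistics.stdev(samples) / math.sqrt(n)
    return mean - half, mean + half


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical run times in seconds for two kernels, five runs each.
kernel_a = [100.2, 99.8, 100.5, 100.1, 99.9]
kernel_b = [103.0, 102.6, 103.4, 102.9, 103.1]

ci_a = confidence_interval(kernel_a)
ci_b = confidence_interval(kernel_b)
print("kernel A:", ci_a)
print("kernel B:", ci_b)
print("distinguishable:", not intervals_overlap(ci_a, ci_b))
```

Checking for non-overlapping intervals is a conservative criterion. If the
intervals do overlap, that only means these samples cannot tell the two
kernels apart, not that the kernels perform the same.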
> Yes, if the differences are small then a few extra runs may be needed
> to drill down into the finer margins. The tester should be able to
> judge that during the test. You get a feel for these things.
>
> I believe that your time would be better spent developing and incorporating
> more tests (wider coverage) than worrying about super-high accuracy.
Accuracy is increased simply by getting more samples. More tests are of
course important, but each must be run enough times for the results to
be statistically significant.
> (And if there's more than a 1% variation between same kernel, compiled
> with different compilers then the test is bust. Kernel CPU time is
> dominated by cache misses and runtime is dominated by IO wait.
> Quality of code generation is of tiny significance)
So we will need a lot of runs to see if there is a difference...
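
A rough way to put a number on "a lot" is the usual normal-approximation
sample-size formula for comparing two means. This is only a sketch: the
function name, the 80% power default, and the noise levels in the example
are my assumptions, not figures from the tests discussed here.

```python
import math
from statistics import NormalDist


def runs_needed(rel_noise, rel_diff, confidence=0.95, power=0.80):
    """Approximate number of runs per kernel needed to detect a difference.

    rel_noise: run-to-run standard deviation as a fraction of the mean
    rel_diff:  difference between the kernels as a fraction of the mean
    Uses n = 2 * ((z_alpha + z_beta) * noise / diff)^2, which assumes
    normally distributed results with equal variance in both groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * rel_noise / rel_diff) ** 2)


# Hypothetical noise levels, looking for a 1% difference between kernels.
for noise in (0.005, 0.01, 0.02):
    print(f"noise {noise:.1%}: about {runs_needed(noise, 0.01)} runs each")
```

The required count grows with the square of the noise-to-difference ratio,
so halving the difference you want to detect roughly quadruples the number
of runs.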
-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb@ucw.cz>