Re: Hardware testing [was Re: VIA Southbridge bug (Was: Crash on boot (2.4.5))]

Vojtech Pavlik (vojtech@suse.cz)
Wed, 11 Jul 2001 11:11:59 +0200


On Tue, Jul 10, 2001 at 11:28:25AM -0400, Rob Landley wrote:
> On Tuesday 10 July 2001 05:17, Ville Herva wrote:
> > On Mon, Jul 09, 2001 at 12:48:59PM -0400, you [Rob Landley] claimed:
> > > (P.S. What kind of CPU load is most likely to send a processor into
> > > overheat? (Other than "a tight loop", thanks. I mean what kind of
> > > instructions?) This is going to be CPU specific, isn't it? Or would a
> > > general instruction mix that doesn't call halt be enough? It would need
> > > to keep the FPU busy too, wouldn't it? And maybe handle interrupts.
> > > Hmmm...)
> >
> > See Robert Redelmeier's cpuburn:
> >
> > http://users.ev1.net/~redelm/
>
> Cool. If nothing else, this is a much better starting point for further work
> than starting from scratch...
>
> > It is coded in assembly specifically to heat the CPU as much as possible. See
> > the README for details, but it seems that floating point operations are
> > tougher than integers and MMX can be even harder (depending on CPU model,
> > of course). Not sure what kind of role SSE, SSE2, 3dNow! play these days.
> > Perhaps Alan knows?
>
> There's at least three separate things that need testing here. memtest86
> tests whether your memory is OK. CPUburn seems to do a good job testing
> processor heat (not that I'm running it on my laptop, which doesn't seem to
> have a thermal readout thingy anyway...)
>
> The third thing (which started this thread) was memory bus. The new 3DNow
> optimizations drove a memory bus into failure, and that IS processor
> specific...

Don't forget the L1/L2/L3 caches. I once had a mainboard with a faulty
L2 cache chip ('twas a K6-3 CPU, plus a FIC VA-503+ mainboard). No memory
or CPU test found the failure, yet kernel compilation kept crashing
after 6-8 hours.

I modified the little 'memtest.c' program (not the big memtest86, just a
small utility that runs under Linux) to use patterns and test sizes
that exercise the L1 and then the L2 cache, and the error showed up
after ten seconds of running the test.
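
For illustration only, here is a rough sketch of the idea, not the actual
modified memtest.c. It assumes that a buffer smaller than a cache level
mostly stays resident in that cache while it is hammered repeatedly, which
the OS and other processes can disturb. The patterns, the function names
(check_pattern, run_cache_test) and the sizes mentioned afterwards are all
made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical test patterns; the ones used in the real test are not known. */
static const uint32_t patterns[] = {
    0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
};
#define NPATTERNS (sizeof patterns / sizeof patterns[0])

/*
 * Fill the buffer with a pattern (XORed with the word index so that
 * neighbouring words differ), read it back, and count mismatches.
 * 'volatile' keeps the compiler from optimizing the read-back away.
 */
static unsigned long check_pattern(volatile uint32_t *buf, size_t words,
                                   uint32_t pat)
{
    unsigned long errors = 0;
    size_t i;

    assert(buf != NULL && words > 0);

    for (i = 0; i < words; i++)
        buf[i] = pat ^ (uint32_t)i;
    for (i = 0; i < words; i++)
        if (buf[i] != (pat ^ (uint32_t)i))
            errors++;
    return errors;
}

/*
 * Run 'passes' rounds of all patterns over a buffer of 'bytes' bytes.
 * Returns the total number of mismatches, or -1 if the size is too
 * small or the allocation fails.
 */
long run_cache_test(size_t bytes, int passes)
{
    size_t words = bytes / sizeof(uint32_t);
    uint32_t *buf;
    long errors = 0;
    size_t p;
    int pass;

    if (words == 0)
        return -1;
    buf = malloc(words * sizeof *buf);
    if (buf == NULL)
        return -1;

    for (pass = 0; pass < passes; pass++)
        for (p = 0; p < NPATTERNS; p++)
            errors += (long)check_pattern(buf, words, patterns[p]);

    free(buf);
    return errors;
}
```

One would call run_cache_test() first with a size below the L1 data cache
(say 16 KB, an assumed figure) and then with a size between L1 and L2 (say
256 KB, also assumed), with many passes each. Any non-zero return on the
second size but not the first would point at the L2.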

-- 
Vojtech Pavlik
SuSE Labs