Re: MCE error

Dave Jones (davej@codemonkey.org.uk)
Mon, 31 Mar 2003 16:39:39 +0100


On Sun, Mar 30, 2003 at 10:47:56AM -0800, Lawrence Walton wrote:
> I just got a MCE error while running 2.5.65 "MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
> Bank 2: 940040000000017a" did a google search and found Dave Jones's parsemce, and decoded it to
>
> Status: (ba) Error IP valid
> Restart IP invalid.
>
> And was wondering what that actually meant. :)

Incomplete dump, what it really means..

(davej@deviant:davej)$ ./a.out -b 2 -e 0xba -s 940040000000017a -a 0
Status: (186) Error IP valid
Restart IP invalid.
parsebank(2): 940040000000017a @ 0
External tag parity error
Correctable ECC error
Address in addr register valid
Error enabled in control register
Memory heirarchy error
Request: Generic error
Transaction type : Generic
Memory/IO : I/O

Looks like the L2 cache ECC checking spotted something going wrong,
and fixed it up. This can happen in cases where there is inadequate
cooling, power, or overclocking (or in rare circumstances, flaky CPUs)

Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/