Scrubbing has nothing whatever to do with reporting correctable errors to
the CPU, even when it is the CPU that does the scrubbing.
Scrubbing does not happen on the basis of chance detection of correctable
errors from normal activity, because that would sometimes be too late.
Remember, the hardware only finds out about an error when the word is
accessed. There is no detection of the bit cell getting its charge altered,
and the errors are cumulative between corrections.
Scrubbing is intended to lower the probability that any given memory word
will be hit by a second error-causing event (such as an alpha particle
emitted from a ceramic case) before the first error has been accessed and
corrected. The scrub just continuously rolls through all of physical memory
(at low priority), again and again, doing whatever level of access is
necessary to cause correction. This bounds the maximum time between
corrections of any memory word. Some memory systems automatically correct
and rewrite
(atomically) on a read of a word with a single bit error. Some mainframe
memory systems do the whole ECC scrub/correction operation in hardware,
simultaneously in each bank.
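
To make the "roll through memory and force correction" idea concrete, here
is a minimal sketch in C. It is a toy model, not how any real memory
controller or the kernel does it: the Hamming(7,4) code stands in for the
much wider SEC/DED codes real ECC memory uses, and every name in it
(ham_encode, ham_syndrome, ham_decode, scrub_pass) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: each "memory word" is a 4-bit value stored as a Hamming(7,4)
 * codeword in bits 1..7 of a byte (bit 0 unused). Parity bits sit at
 * positions 1, 2, 4; data bits at positions 3, 5, 6, 7. */
static uint8_t ham_encode(uint8_t d)
{
    uint8_t d0 = d & 1, d1 = (d >> 1) & 1, d2 = (d >> 2) & 1, d3 = (d >> 3) & 1;
    uint8_t p1 = d0 ^ d1 ^ d3;  /* covers positions 3, 5, 7 */
    uint8_t p2 = d0 ^ d2 ^ d3;  /* covers positions 3, 6, 7 */
    uint8_t p4 = d1 ^ d2 ^ d3;  /* covers positions 5, 6, 7 */
    return (uint8_t)(p1 << 1 | p2 << 2 | d0 << 3 | p4 << 4 |
                     d1 << 5 | d2 << 6 | d3 << 7);
}

/* XOR of the positions of all set bits: 0 for a clean codeword, otherwise
 * the position of the flipped bit (assuming only one bit flipped). */
static unsigned ham_syndrome(uint8_t cw)
{
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 7; pos++)
        if (cw & (1u << pos))
            s ^= pos;
    return s;
}

/* Extract the 4 data bits without attempting any correction. */
static uint8_t ham_decode(uint8_t cw)
{
    return (uint8_t)(((cw >> 3) & 1) | ((cw >> 5) & 1) << 1 |
                     ((cw >> 6) & 1) << 2 | ((cw >> 7) & 1) << 3);
}

/* One scrub pass: touch every word, and rewrite any word whose syndrome is
 * nonzero. Returns the number of words rewritten. A real scrubber would
 * repeat this forever at low priority. */
static size_t scrub_pass(uint8_t *mem, size_t n)
{
    size_t corrected = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned s = ham_syndrome(mem[i]);
        if (s != 0) {
            mem[i] ^= (uint8_t)(1u << s);  /* flip the indicated bit back */
            corrected++;
        }
    }
    return corrected;
}
```

In this model, two bit flips in the same word with a scrub pass in between
are both repaired. Two flips in the same word with no pass in between give a
nonzero syndrome that points at the wrong bit, so the "correction" leaves
the word decoding to a wrong value. That is the case scrubbing is trying to
make rare.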
The primary benefit of logging is to let periodic maintenance catch
deteriorating memory cells: ones that either do not correct at all (a single
stuck bit, so that single hits become uncorrectable) or that repeatedly fail
over time, perhaps due to charge leaks from long-term diffusion of
contaminants.
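
As a rough illustration of what such a log might track, here is a small
sketch in C that counts corrected errors per address and flags repeat
offenders. It is only an assumption about one possible shape for this: the
structure names, the fixed log size and the threshold are all hypothetical
and not taken from any existing driver.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_SLOTS        16  /* hypothetical: tiny fixed-size log */
#define REPEAT_THRESHOLD 3   /* hypothetical: corrections before a cell is suspect */

struct ce_entry {
    uintptr_t addr;   /* address of the word that needed correction */
    unsigned  count;  /* how many times it has been corrected */
};

struct ce_log {
    struct ce_entry slot[LOG_SLOTS];
    size_t used;
};

/* Record one corrected error at addr. Returns 1 once that address has
 * reached REPEAT_THRESHOLD corrections, else 0 (also 0 if the log is
 * full and the address could not be recorded). */
static int ce_log_record(struct ce_log *log, uintptr_t addr)
{
    for (size_t i = 0; i < log->used; i++) {
        if (log->slot[i].addr == addr)
            return ++log->slot[i].count >= REPEAT_THRESHOLD;
    }
    if (log->used < LOG_SLOTS) {
        log->slot[log->used].addr = addr;
        log->slot[log->used].count = 1;
        log->used++;
    }
    return 0;
}
```

A one-off hit shows up once and never again. A stuck or leaky cell keeps
coming back at the same address, which is what maintenance would look for.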
Cheers,
Ed