Re: What's left over.

Werner Almesberger (wa@almesberger.net)
Thu, 31 Oct 2002 19:37:05 -0300


Jeff Garzik wrote:
> That said, I used to be an LKCD cheerleader until a couple people made
> some good points to me: it is not nearly low-level enough to truly be
> of use in crash situations.

I'm not so convinced about this. I like the Mission Critical
approach: save the dump to memory, then either boot through the
firmware or through bootimg (nowadays, that would be kexec),
then retrieve the dump from memory, and do whatever you like
with it.

The huge advantage here is that you don't need a ton of
specialized dump drivers and/or have much of the original kernel
infrastructure to be in a usable state. The rebooted system will
typically be stable enough to offer the full range of utilities,
including up to date drivers for all possible devices, so you
can safely write to disk, scp all the mess to your support
critter, or post an automatic flame to linux-kernel :-)

The weak points of the Mission Critical design are that early
memory allocation in the kernel needs to be tightly controlled,
that architectures that wipe CPU caches on reboot need to
commit them to memory before the firmware restart, and that
drivers need to be able to recover from an "unclean" hardware
state. (I think we'll see much of the latter happen as kexec
advances. The other two issues aren't really special.)

Actually, at the RAS BOF I thought that IBM were developing LKCD
in this direction, and had also eliminated a few not so elegant
choices of Mission Critical's original design. I haven't looked
at the LKCD code, but the descriptions sound as if all the
special-case cruft seems to be back again, which I would find a
little disappointing.

There might be a case for specialized low-overhead dump handlers
for small embedded systems and such, but they're probably better
maintained outside of the mainstream kernel. (They're more like
firmware anyway.)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/