Re: [linux-usb-devel] Re: [BK PATCH] USB changes for 2.5.34

Daniel Phillips (phillips@arcor.de)
Sun, 15 Sep 2002 07:01:44 +0200

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Kevin Carpenter: "Think 2.4.10 broke my PCI subsystem."
Previous message: Shawn Starr: "Re: ANNOUNCEMENT: 2.4.20-pre7-rmap14a-xfs-uml-shawn12d released - dont use 12c"

On Tuesday 10 September 2002 18:51, Linus Torvalds wrote:
> On Tue, 10 Sep 2002, David Brownell wrote:
> > Or in even shorter sound bite format: "Just say no to BUG()s."
>
> Well, the thing is, BUG() _is_ sometimes useful. It's a dense and very
> convenient way to say that something catastrophic happened.

There's an important case you're overlooking that takes us out of the
"sometimes" zone, and that is where you want to load up some piece of
heavily-context-dependent code with assertions, just to have confidence
that the many assumptions actually hold. And once you have those hooks
in the code, it often makes little sense to take them out, because
they'll just have to go back in again the next time some remote part
of the kernel violates one of the assumptions.

What you *want* to do, is just turn them off, not remove them. (Sure,
there are lots that shouldn't survive the alpha version of any code,
but still lots of good ones that should be kept, just stubbed out.)

So this is just a name problem: define MYSUBSYSTEM_BUG_ON which is nil
in production, but equivalent to BUG_ON for alpha or beta code. In the
former case it means the code is going to run a little faster, plus the
system is going to be a little more resilient, as you say.

Otherwise, completely agreed.

> And actually, outside of drivers and filesystems you can often know (or
> control) the number of locks the surrounding code is holding, and then a
> BUG() may not be as lethal. At which point the normal "oops and kill the
> process" action is clearly fine - the machine is still perfectly usable.

Eventually we could try some fancy trick that involves keeping track
of the locks in a systematic way, so that they can be analyzed
automatically by a tool that generates lock-breaking code. Then a
subsystem could use the generated code in its error path. Sure, it's a
lot of work just to add another "9" to reliability, but when was anything
ever easy?

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Kevin Carpenter: "Think 2.4.10 broke my PCI subsystem."
Previous message: Shawn Starr: "Re: ANNOUNCEMENT: 2.4.20-pre7-rmap14a-xfs-uml-shawn12d released - dont use 12c"