Hi Rob,
I fully support the idea of auditing the Linux device drivers - using guidelines,
hardware fault injection, stress testing etc. - and fixing any potential bugs.
This is obviously a very important task, because the drivers are some of the
ugliest code I've seen in the kernel.
"Pro-active monitoring", ie by basically gathering whatever statistics are
available and feeding them to some sort of user-space application and then
trying to deduce a potential failure is also a very valuable goal; so exposing
more statistics seems definitely good, too. As long as that doesn't introduce
even more errors...
Any help you can offer on the above is surely appreciated by all involved and
will have a direct, positive impact on Linux.
That said, and the fluff in your specification aside (which was very likely
necessary for management ;-), your spec certainly contains some good points on
how to write stable and robust code. (This is aside from the comments the
others have already raised regarding event logging, and of course all
recommendations need to be thoughtfully applied to the case in question.)
The statistics can best be exposed via driverfs or /proc (for kernels which
don't have driverfs); however, neither the statistics analyser nor the SNMP
agent pre-processing belongs in the kernel itself. Keep the drivers as lean as
possible; that will introduce fewer errors at this level. I object to the CSM
being in kernel space. Having a more or less common API for the statistics to
be gathered and exposed by the drivers would indeed be highly valuable, though.
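To illustrate the split I have in mind, here is a rough user-space sketch in
Python. It assumes, purely for illustration, that a driver exposes its raw
counters as plain "name value" lines in some file under driverfs or /proc; the
file format, the counter names and the threshold are all made up by me and are
not taken from your spec. The driver only counts; all interpretation happens
outside the kernel.

```python
from typing import Dict, List


def parse_stats(text: str) -> Dict[str, int]:
    """Parse 'name value' lines, one counter per line (an assumed format)."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, value = parts
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # skip lines whose value is not an integer
    return stats


def read_stats(path: str) -> Dict[str, int]:
    """Read one snapshot from a statistics file (the path is hypothetical)."""
    with open(path) as f:
        return parse_stats(f.read())


def suspicious_counters(prev: Dict[str, int], curr: Dict[str, int],
                        threshold: int) -> List[str]:
    """Names of counters that grew by more than `threshold` between snapshots.

    A counter missing from `prev` is treated as having started at 0.
    """
    return sorted(name for name, value in curr.items()
                  if value - prev.get(name, 0) > threshold)
```

A monitoring daemon would call read_stats() periodically on whatever file the
driver provides and hand the result of suspicious_counters() to its SNMP agent
or logger; none of that logic needs to live in the driver.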
What are your further timelines?
A lot of the above - ie, auditing and testing current drivers - can be done
without (or at least without much more) further planning; I'm always rather
amazed at how much effort Intel, IBM and their child OSDL spend on pretty
specifications, effort which could also be applied to real work ;-)
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
-- 
Principal Squirrel Research and Development, SuSE Linux AG
``Immortality is an adequate definition of high availability for me.''
	--- Gregory F. Pfister