Mike Gerdts wrote:

For more of a devil's advocate view, take a look at this research
about "Failure Oblivious Computing" at http://lwn.net/Articles/188059/
and http://www.usenix.org/events/osdi04/tech/rinard.html.

Mike


An unexpected NULL pointer is an indication that something is seriously
wrong, somewhere, and the programmer's assumptions were incorrect.

For conventional programming, this should result in an immediate
crash.  We don't want to continue execution, delivering possibly
wrong answers to people or persistent storage after such a failure
has occurred, because giving someone a silently wrong answer is
usually far worse than simply crashing.
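
As a rough illustration of the difference (a minimal user-level sketch,
not kernel code; the function names are made up for this example), a
fail-fast check stops at the point where the assumption breaks, while
an "oblivious" check papers over it and hands back a plausible value:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: fail-fast.  An unexpected NULL means a broken
 * assumption somewhere, so stop right here rather than carry on.
 */
size_t
name_length_failfast(const char *name)
{
	if (name == NULL) {
		fprintf(stderr, "name_length_failfast: unexpected NULL\n");
		abort();	/* usually leaves a core file to examine */
	}
	return (strlen(name));
}

/*
 * Hypothetical helper: "oblivious".  Hides the bug and returns a value
 * the caller cannot tell apart from a legitimate empty string.
 */
size_t
name_length_oblivious(const char *name)
{
	if (name == NULL)
		return (0);	/* silently wrong answer */
	return (strlen(name));
}
```

With the second version, a caller that passes NULL by mistake keeps
running and may write that 0 out to disk; with the first, the process
dies at the point of the bug.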

The Solaris kernel crashes/panics when something goes wrong
and (usually) leaves a crash dump for post-mortem analysis.
The machine then reboots and continues.

Clearly, systems that are non-redundant and life-critical
(pacemakers, etc.) need to have a much more sophisticated
error handling strategy - but I'm pretty sure they rely
on rapid restart from known conditions (e.g. reboot).

- Bart


--
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts