On Tuesday, April 25, 2017 at 7:52:25 PM UTC-7, Dave Cheney wrote:
>
> Yes, and then crashes the program. In the scenario I described, with
> thousands of other requests in flight that meet an abrupt end. That could
> be incredibly costly, even if it's been planned for.
>
> There are a host of other reasons that can take a server offline abruptly.
> It seems like an odd misallocation of resources to try to prevent one
> specific case - a goroutine panics due to a programming error or input
> validation failure - both of which are far better addressed with testing.

There's a cost-benefit analysis to be done, for sure, but I don't think it's always a misallocation of resources. It isn't costly for every program, and for the programs where it matters, it isn't always a hard problem to solve. To your point, though, for a great many programs the effort probably isn't worth the reward.
> To try to postpone the exit of a program after a critical error to me
> implies a much more complex testing and validation process that has
> identified all the shared state in the program and verified that it is
> correct in the case that a panic is caught.

Not always applicable, but there are some relatively easy ways of coping with that:

- Don't have shared state to begin with (for a large number of programs, this isn't that hard! Look at how far PHP has gotten, for example)
- Don't have mutable shared state
- Copy on write, and only publish immutable shared state

Those properties can also make testing and validation much easier, I should note. And with them, I don't think it's necessarily hard to isolate a particular lifecycle, for example an HTTP request: often it's just an HTTP handler that defers a recover and calls the real handler. In the case of publishing an immutable object graph to shared state, only publish it once it has been verified; if a panic occurs in the publishing goroutine, the published state remains in a known-good condition. (There are two rough sketches of what I mean at the end of this message.)

Of course, it's entirely possible to imagine a program complex enough that its shared state isn't simple to manage. I would also argue, independently of whether it's worth any effort to make a single lifecycle crash-safe, that once a program reaches that level of complexity, we should question whether all of that state belongs in the same process at all. Split it up and get process isolation from the operating system (and scale that up to multiple machines as well, to your third point).

> To me it seems simpler and more likely to have the root cause of the panic
> addressed to just let the program crash. The alternative, somehow
> firewalling the crash, and its effects on the internal state of your
> program, sounds unworkably optimistic.

I'm by no means advocating for leaving a fault in a program. I don't believe these are alternatives at all! Fix your program! But I certainly don't think resiliency within a process space is always unworkable. Perhaps optimistic, I'll give you that :)
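Here's the first sketch: the "handler that defers a recover" shape. The names (recoverHandler and so on) are made up for illustration, and this is one way to write it, not the only one:

package main

import (
	"log"
	"net/http"
)

// recoverHandler confines a panic to the request that caused it: that
// request gets a 500, and the rest of the server keeps serving.
func recoverHandler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if err := recover(); err != nil {
				log.Printf("panic handling %s %s: %v", r.Method, r.URL.Path, err)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

func main() {
	realHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Imagine a programming error or bad input causing a panic in here.
		w.Write([]byte("ok\n"))
	})
	log.Fatal(http.ListenAndServe(":8080", recoverHandler(realHandler)))
}

(net/http already recovers a panicking handler on its own and logs it, so the process survives either way; the wrapper is about controlling the response and the logging, and the same shape works for non-HTTP worker goroutines too.)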
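And the second sketch, for "copy on write, publish only once verified", using atomic.Value. Config, publish, and the validation check are all invented here just to show the shape; readers must treat whatever they load as read-only:

package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Config stands in for some immutable object graph that many goroutines read.
type Config struct {
	Routes map[string]string
}

// current always holds a complete, validated *Config (or nothing, before the
// first publish).
var current atomic.Value

// publish builds and validates a new Config and only then makes it visible.
// If validation fails, or the goroutine doing this panics earlier, readers
// simply keep seeing the previously published value.
func publish(routes map[string]string) error {
	next := &Config{Routes: routes}
	if len(next.Routes) == 0 {
		return errors.New("refusing to publish an empty config")
	}
	current.Store(next) // atomic swap: old or new, never a half-built value
	return nil
}

// load returns the most recently published Config, or nil before the first publish.
func load() *Config {
	c, _ := current.Load().(*Config)
	return c
}

func main() {
	if err := publish(map[string]string{"/": "home"}); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println(load().Routes["/"])
}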