On Tuesday, April 25, 2017 at 7:52:25 PM UTC-7, Dave Cheney wrote:
>
> Yes, and then crashes the program. In the scenario I described, with
> thousands of other requests in flight that meet an abrupt end. That could
> be incredibly costly, even if it's been planned for.
>
> There are a host of other reasons that can take a server offline abruptly.
> It seems like an odd misallocation of resources to try to prevent one
> specific case - a goroutine panics due to a programming error or input
> validation failure - both of which are far better addressed with testing.

There's a cost-benefit analysis to be done, for sure, but I don't think it's always a misallocation of resources. It isn't costly for every program, and for the programs where it matters, it isn't always a hard problem to solve. To your point, though, for a great many programs the effort probably isn't worth the reward.
> To try to postpone the exit of a program after a critical error to me
> implies a much more complex testing and validation process that has
> identified all the shared state in the program and verified that it is
> correct in the case that a panic is caught.

Not always applicable, but there are some relatively easy ways of coping with that:

- Don't have shared state to begin with (for a large number of programs, this isn't that hard! Look at how far PHP has gotten, for example)
- Don't have mutable shared state
- Copy on write, and only publish immutable shared state

Those properties can also make testing and validation much easier, I should note. And with them, I don't think it's necessarily hard to isolate a particular lifecycle, for example an HTTP request: often it's just an HTTP handler that defers a recover and calls the real handler. In the case of publishing an immutable object graph to shared state, only publish it once it has been verified; if a panic occurs in the publishing goroutine, the published state remains in a known-good condition. (There are two rough sketches of what I mean at the end of this message.)

Of course, it's entirely possible to imagine a program complex enough that its shared state isn't simple to manage. I would also argue, independently of whether it's worth any effort to make a single lifecycle crash-safe, that once a program reaches that level of complexity, we should question whether all of that state belongs in the same process at all. Split it up and get process isolation from the operating system (and scale that up to multiple machines as well, to your third point).

> To me it seems simpler and more likely to have the root cause of the panic
> addressed to just let the program crash. The alternative, somehow
> firewalling the crash, and its effects on the internal state of your
> program, sounds unworkably optimistic.

I'm by no means advocating for leaving a fault in a program. I don't believe these are alternatives at all! Fix your program! But I certainly don't think resiliency within a process space is always unworkable. Perhaps optimistic, I'll give you that :)
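Here's the first sketch: the "handler that defers a recover" shape. The names (recoverHandler and so on) are made up for illustration, and this is one way to write it, not the only one:

package main

import (
	"log"
	"net/http"
)

// recoverHandler confines a panic to the request that caused it: that
// request gets a 500, and the rest of the server keeps serving.
func recoverHandler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if err := recover(); err != nil {
				log.Printf("panic handling %s %s: %v", r.Method, r.URL.Path, err)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

func main() {
	realHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Imagine a programming error or bad input causing a panic in here.
		w.Write([]byte("ok\n"))
	})
	log.Fatal(http.ListenAndServe(":8080", recoverHandler(realHandler)))
}

(net/http already recovers a panicking handler on its own and logs it, so the process survives either way; the wrapper is about controlling the response and the logging, and the same shape works for non-HTTP worker goroutines too.)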
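And the second sketch, for "copy on write, publish only once verified", using atomic.Value. Config, publish, and the validation check are all invented here just to show the shape; readers must treat whatever they load as read-only:

package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Config stands in for some immutable object graph that many goroutines read.
type Config struct {
	Routes map[string]string
}

// current always holds a complete, validated *Config (or nothing, before the
// first publish).
var current atomic.Value

// publish builds and validates a new Config and only then makes it visible.
// If validation fails, or the goroutine doing this panics earlier, readers
// simply keep seeing the previously published value.
func publish(routes map[string]string) error {
	next := &Config{Routes: routes}
	if len(next.Routes) == 0 {
		return errors.New("refusing to publish an empty config")
	}
	current.Store(next) // atomic swap: old or new, never a half-built value
	return nil
}

// load returns the most recently published Config, or nil before the first publish.
func load() *Config {
	c, _ := current.Load().(*Config)
	return c
}

func main() {
	if err := publish(map[string]string{"/": "home"}); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println(load().Routes["/"])
}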