On Wed, Apr 26, 2017 at 10:55 AM, Peter Herth <he...@peter-herth.de> wrote:

>
>
> On Wed, Apr 26, 2017 at 3:07 AM, Dave Cheney <d...@cheney.net> wrote:
>
>>
>>
>> On Wednesday, 26 April 2017 10:57:58 UTC+10, Chris G wrote:
>>>
>>> I think those are all excellent things to do. They do not preclude the
>>> use of recovering from a panic to assist (emphasis on assist - it is
>>> certainly no silver bullet) in achieving fault tolerance.
>>>
>>> Assuming a web service that needs to be highly available, crashing the
>>> entire process due to one misbehaved goroutine is irresponsible.  There can
>>> be thousands of other active requests in flight that could fail gracefully
>>> as well, or succeed at their task.
>>>
>>> In this scenario, I believe a well behaved program should
>>>
>>>    - clearly log all information about the fault
>>>
>>> panic does that
>>
>
> No, panic certainly does not do that. It prints the stack trace. A proper
> logger could add additional information about the program state at the
> point of the panic, which is not visible from the stack trace. It also
> might at least be reasonable to perform an auto-save before quitting.
>
>> Same; relying on a malfunctioning program to report its failure is like
>> asking a sick human to perform their own surgery.
>>
>
> What makes you think that a panic implies that the whole program is
> malfunctioning?
>

But *that is not the claim*. The claim is that if you discover a condition
which can uniquely be attributed to a code bug, you should always err on
the side of safety and prefer bailing out to continuing with a known-bad
program. It's not "as I see this bug, I know the rest of the program is
broken too", it's "as I see this bug, I cannot pretend that it can't be".


> A panic should certainly be taken seriously, and the computation in which it
> happened should be aborted. But if you think of a functional programming
> style
>

If you are thinking of that, then you are not thinking about Go. Go has
shared state and mutable data. One of the major arguments here is that
there is one level of state isolation which, from all we know, is very
good: the process. If the process dies, all locks are released, file
descriptors closed and memory freed, so it gives a known-good restarting
point. In the presence of mutable state, potential data races and code
bugs, that is the correct layer of isolation to fall back to. I am also
aware that it is not a perfect layer; you might have already corrupted
on-disk state or abused a protocol to corrupt some state on the network.
Those also need to be defended against, but process isolation still gives
a good tradeoff between efficiency, convenience and safety.
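
To make the point about mutable state concrete, here is a minimal sketch
(not from the original thread). The account type, withdraw, safeWithdraw and
lockStillHeld are hypothetical names made up for illustration, and the panic
stands in for any code bug. It assumes Go 1.18 or later, because it uses
sync.Mutex.TryLock to inspect the lock. The handler recovers from the panic
and the process stays up, but the mutex is never released:

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical piece of shared mutable state.
type account struct {
	mu      sync.Mutex
	balance int
}

// withdraw contains a deliberate bug: it can panic while holding the
// lock, and it does not use defer to release it.
func (a *account) withdraw(n int) {
	a.mu.Lock()
	if n > a.balance {
		panic("insufficient funds") // stands in for any code bug
	}
	a.balance -= n
	a.mu.Unlock()
}

// safeWithdraw recovers from a panic in withdraw, the way a request
// handler might, and reports whether a panic was recovered.
func safeWithdraw(a *account, n int) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
		}
	}()
	a.withdraw(n)
	return false
}

// lockStillHeld reports whether the mutex was left locked.
func lockStillHeld(a *account) bool {
	if a.mu.TryLock() {
		a.mu.Unlock()
		return false
	}
	return true
}

func main() {
	a := &account{balance: 10}
	fmt.Println("recovered:", safeWithdraw(a, 100))
	fmt.Println("lock still held:", lockStillHeld(a))
}
```

After the recover, every later call to withdraw on the same account blocks
forever. A deferred Unlock would fix this particular bug, but the recovering
code has no general way of knowing which invariants the panicking code left
broken; killing the process releases the lock unconditionally.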


FWIW, I don't believe there is any convincing to be done here on either
side. There are no technical arguments anymore; it is just that one set of
people are holding one belief and another set of people are holding another
belief. Both certainly do that based on technical arguments, but in the
end, they are simply weighing them differently.

I mean, I definitely agree that it would be great for a program to never
crash. Or to have only panics which definitely can't be recovered from. Or
to have all state isolated and safely expungeable. I agree that the
process staying up for longer is valuable and that other requests
shouldn't fail because one of them misbehaved.

I also assume you agree that errors should be noticed, caught and fixed. I
assume you agree that crashing a binary will make the bug more noticeable.
That crashing would allow you to recover from a safer and better-known
state. That being able to recover from any crash swiftly, and architecting
a service so that dying processes don't take it down, is valuable. And
that bugs shouldn't make it to production.

The facts are straight; this is just a question of opinion and different
experiences, and I don't see any way out of it other than saying "agree to
disagree; if you don't think you can tolerate panics, you just can't use
my stuff, and I won't use yours if I consider it to hide failures or be
unergonomic".

This argument becomes much more difficult when I'm having it with my
coworkers, as it *does* depend on how the service is run, which needs to be
decided by the team; with regard to this thread, at least we all have the
luxury that we can agree to disagree and move on :)
