Thanks Boris.  

I promised Boris that if he posted his suggestion back here on the list
instead of in a DM, I would explain the problem with traditional
sync.WaitGroup usage (as found in the library he referenced) that the
idem library solves. So here goes:

Imagine you have a backup server that you are sending lots of files to.
Further imagine that you have a lot of goroutines running
in parallel to handle the uploads.

Now let's suppose that we want to handle an out-of-disk-space condition
gracefully.
Really this could be any error where we want to shut down a worker pool
of goroutines early, and we've been using a sync.WaitGroup to manage
them.

So our goal is: we've run out of disk, and now we want to stop all the
in-progress goroutines from spewing errors all over our logs, but without
panicking. Suppose we want our server to stay up and responsive, answering
questions about its health and telling us about its disk space problem,
while maybe still serving reads, just not writes. Anyway... you get the
point. The goal is: don't panic, and shut down a pool of goroutines
gracefully.

We can always panic and catch the panics, but that could also hide
some very different errors, or logic errors in the code itself, and so
we want to, as a general goal, avoid using panic as a catch-all.

If we are using a sync.WaitGroup, and have say 40 upload jobs in progress
when we run out of disk space, then we need to cancel all the jobs in
progress before they have finished.

But, because we have used a traditional sync.WaitGroup,
we are hosed. We cannot cancel the Wait once it has started. There is 
no mechanism designed in to do that.
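
To illustrate, here is a minimal sketch of the usual standard-library
workaround and why it does not really cancel anything. The names
waitOrCancel and demoStuckWait are hypothetical, made up for this example.
The caller can stop waiting when the context is done, but the helper
goroutine stays parked inside wg.Wait() until the counter reaches zero,
which may be never.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// waitOrCancel reports true if every task finished, or false if ctx
// was done first. Hypothetical helper for illustration only.
func waitOrCancel(ctx context.Context, wg *sync.WaitGroup) bool {
	done := make(chan struct{})
	go func() {
		// This goroutine cannot be interrupted: it remains blocked
		// here until the counter hits zero, even after the caller
		// has given up.
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-ctx.Done():
		return false
	}
}

// demoStuckWait models one upload that never finishes.
func demoStuckWait() bool {
	var wg sync.WaitGroup
	wg.Add(1) // never matched by a Done
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return waitOrCancel(ctx, &wg)
}

func main() {
	fmt.Println("finished:", demoStuckWait())
}
```

The caller gets control back, but the Wait itself is still pending in the
background, so this hides the problem rather than solving it.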

Now you say, well, we could artificially decrement the wait count down to 
zero to release
the Wait call, right? Well, that runs into the panic problem, because now 
you
have a race with any goroutine that is finishing right as you want to force
the wait to end by artificially decrementing the wait group count. That race
means that you are just as likely to spuriously panic when trying to abort 
the
wait as not.  So that is a non-starter, given the rules of the game 
outlined above:
no spurious panics--as they hide real problems and/or bad logic. 
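
To make the hazard concrete, here is a minimal standard-library sketch.
The names forceRelease and demoOverDecrement are hypothetical. It replays
one losing interleaving of the race deterministically: a worker calls
Done just before the supervisor, still believing two tasks are
outstanding, decrements twice. The counter goes negative and
sync.WaitGroup panics.

```go
package main

import (
	"fmt"
	"sync"
)

// forceRelease tries to abort a Wait by calling Done n times.
// It reports whether the WaitGroup panicked along the way.
func forceRelease(wg *sync.WaitGroup, n int) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true // "sync: negative WaitGroup counter"
		}
	}()
	for i := 0; i < n; i++ {
		wg.Done()
	}
	return false
}

// demoOverDecrement simulates a worker finishing right before the
// supervisor force-releases using its stale count of 2.
func demoOverDecrement() bool {
	var wg sync.WaitGroup
	wg.Add(2)
	wg.Done() // worker 1 finishes normally; real count is now 1
	return forceRelease(&wg, 2)
}

func main() {
	fmt.Println("panicked:", demoOverDecrement())
}
```

In the sketch the panic happens in the supervisor, where recover can
catch it. In the real race it can just as easily fire inside a worker's
own Done call, where an unrecovered panic takes down the whole process.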

So this is the problem that an idem.IdemCloseChan solves with the methods
TaskAdd, TaskDone, and TaskWait.
See https://github.com/glycerine/idem/blob/master/halter.go#L427

Specifically, TaskWait takes a giveup channel (that could come from a
context), and you can also just close the IdemCloseChan to stop waiting.

This gives you a clean early exit from waiting on a count of tasks--and no 
spurious panics to suppress.
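
For readers who want the shape of the idea without reading halter.go,
here is a rough sketch built only from the standard library. It is NOT
idem's implementation or API: the type taskCounter and its methods are
hypothetical stand-ins, and for brevity the task count is fixed at
construction instead of being grown by something like TaskAdd. The point
is that waiting is a select over channels, so it can be abandoned
cleanly, and an extra decrement is ignored rather than panicking.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrAborted = errors.New("wait aborted")

// taskCounter is a hypothetical abortable task counter.
type taskCounter struct {
	mu     sync.Mutex
	n      int
	zero   chan struct{} // closed when n reaches zero
	halt   chan struct{} // closed once to release all waiters early
	halted bool
}

func newTaskCounter(n int) *taskCounter {
	tc := &taskCounter{n: n, zero: make(chan struct{}), halt: make(chan struct{})}
	if n <= 0 {
		close(tc.zero)
	}
	return tc
}

// taskDone decrements the count; surplus calls are ignored, never a panic.
func (tc *taskCounter) taskDone() {
	tc.mu.Lock()
	defer tc.mu.Unlock()
	if tc.n <= 0 {
		return
	}
	tc.n--
	if tc.n == 0 {
		close(tc.zero)
	}
}

// close aborts all current and future waits; safe to call repeatedly.
func (tc *taskCounter) close() {
	tc.mu.Lock()
	defer tc.mu.Unlock()
	if !tc.halted {
		tc.halted = true
		close(tc.halt)
	}
}

// taskWait returns nil when the count reaches zero, or ErrAborted if the
// counter is closed or giveup fires first. A nil giveup is never selected.
func (tc *taskCounter) taskWait(giveup <-chan struct{}) error {
	select {
	case <-tc.zero:
		return nil
	case <-tc.halt:
		return ErrAborted
	case <-giveup:
		return ErrAborted
	}
}

// demo runs an aborted wait (40 stuck uploads) and a normal completion.
func demo() (aborted, completed error) {
	tc := newTaskCounter(40)
	go tc.close() // e.g. the disk filled up
	aborted = tc.taskWait(nil)

	tc2 := newTaskCounter(3)
	for i := 0; i < 3; i++ {
		go tc2.taskDone()
	}
	completed = tc2.taskWait(nil)
	return aborted, completed
}

func main() {
	fmt.Println(demo())
}
```

A giveup channel taken from ctx.Done() plugs straight into taskWait, which
is how a context would drive the early exit in this sketch.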

On Saturday, February 15, 2025 at 2:20:44 PM UTC Nagaev Boris wrote:

> On Thu, Feb 13, 2025 at 6:11 PM Jason E. Aten <j.e....@gmail.com> wrote:
>
> > 3) I almost always need to know when my goroutines are done,
> > and to shut them all down in case of an error from one.
> >
> > My idem package provides goroutine supervision trees
> > and also shows how to integrate sync.WaitGroup-style
> > task counting effectively with Go channels. Normal
> > sync.WaitGroup is a hazard in production code because
> > you cannot abort a Wait in case of error or shutdown.
> >
> > https://github.com/glycerine/idem
>
>
> Hey Jason!
>
> Maybe this one can be interesting for you:
> https://pkg.go.dev/github.com/lightningnetwork/lnd/fn/v2#GoroutineManager
> It is based on a mutex and a WaitGroup and utilizes context.Context.
>
>
> --
> Best regards,
> Boris Nagaev
>
