Hello everyone,

Over the past year, I've encountered two strange issues 
with net.ListenTCP and listener.Accept. Without explicitly 
enabling reuseport, multiple service processes on the same machine, all 
searching for available ports starting from 9000, managed to successfully 
call listen on the same IP and port. At least when calling net.ListenTCP, it 
returned err == nil, and the error only appeared during listener.Accept. 
However, at the time, we weren't checking the returned error or 
printing the error message. Instead, whenever the returned conn was 
nil, we simply retried listener.Accept in a for-loop.

We've reproduced this issue only twice within a year, so it is very 
difficult to trigger. The environment was a virtual machine allocated on a 
physical host with a Linux 5.4 kernel. Our immediate fix was to add the 
error-checking logic and print the specific error. While handling this 
issue, we also ran into the problem with netError.Temporary().

I completely agree with Ian's insight: "Whether an error is temporary 
depends on what you were doing at the time." For the specific case 
of listener.Accept(), even if netError.Temporary() returns true, retrying 
doesn't necessarily mean the service can remain available. Errors 
manifest in very different ways. In our specific flawed usage scenario, 
the service had already successfully registered with the name service, and 
other services had already discovered it and started sending requests. 
However, because the listen wasn't actually successful (the IP:port was held 
by another process), the result was persistent access failures.

But if we don't use Temporary(), asking developers to enumerate every 
temporary error that can safely be retried is not straightforward. 
Could several categorical functions, similar to IsTimeout, be 
provided so that developers can combine them freely? For example, something 
like if ne.IsTimeout() || ne.IsXXX() || ne.IsYYY().

On Friday, April 22, 2022 at 6:39:39 AM UTC+8 Caleb Spare wrote:

> On Thu, Apr 21, 2022 at 7:16 AM 'Bryan C. Mills' via golang-nuts
> <[email protected]> wrote:
> >
> > Even ENFILE and EMFILE are not necessarily blindly retriable: if the 
> process has run out of files, it may be because they have leaked (for 
> example, they may be reachable from deadlocked goroutines).
> > If that is the case, it is arguably better for the program to fail with 
> a useful error than to keep retrying without making progress.
> >
> > (I would argue that the retry loop in net/http.Server is a mistake, and 
> should be replaced with a user-configurable semaphore limiting the number 
> of open connections — thus avoiding the file exhaustion in the first place!)
>
> ENFILE might be caused by a different process entirely, no?
>
> >
> > On Wednesday, April 20, 2022 at 10:49:20 PM UTC-4 Ian Lance Taylor wrote:
> >>
> >> On Wed, Apr 20, 2022 at 6:46 PM 'Damien Neil' via golang-nuts
> >> <[email protected]> wrote:
> >> >
> >> > The reason for deprecating Temporary is that the set of "temporary" 
> errors was extremely ill-defined. The initial issue for 
> https://go.dev/issue/45729 discusses the de facto definition of Temporary 
> and the confusion resulting from it.
> >> >
> >> > Perhaps there's a useful definition of temporary or retriable errors, 
> perhaps limited in scope to syscall errors such as EINTR and EMFILE. I 
> don't know what that definition is, but perhaps we should come up with one 
> and add an os.ErrTemporary or some such. I don't think leaving 
> net.Error.Temporary undeprecated was the right choice, however; the need 
> for a good way to identify transient system errors such as EMFILE doesn't 
> mean that it was a good way to do so or could ever be made into one.
> >>
> >> To frame issue 45729 in a different way, whether an error is temporary
> >> is not a general characteristic. It depends on the context in which
> >> it appears. For the Accept loop in http.Server.Serve really the only
> >> plausible temporary errors are ENFILE and EMFILE. Perhaps the net
> >> package needs a RetriableAcceptError function.
> >>
> >> Ian
> >>
> >>
> >>
> >> > On Wednesday, April 20, 2022 at 6:02:34 PM UTC-7 [email protected] 
> wrote:
> >> >>
> >> >> In Go 1.18 net.Error.Temporary was deprecated (see
> >> >> https://go.dev/issue/45729). However, in trying to remove it from my
> >> >> code, I found one way in which Temporary is used for which there is 
> no
> >> >> obvious replacement: in a TCP server's Accept loop, when deciding
> >> >> whether to wait and retry an Accept error.
> >> >>
> >> >> You can see an example of this in net/http.Server today:
> >> >> 
> https://github.com/golang/go/blob/ab9d31da9e088a271e656120a3d99cd3b1103ab6/src/net/http/server.go#L3047-L3059
> >> >>
> >> >> In this case, Temporary seems useful, and enumerating the OS-specific
> >> >> errors myself doesn't seem like a good idea.
> >> >>
> >> >> Does anyone have a good solution here? It doesn't seem like this was
> >> >> adequately considered when making this deprecation decision.
> >> >>
> >> >> Caleb
> >> >
> >> > --
> >> > You received this message because you are subscribed to the Google 
> Groups "golang-nuts" group.
> >> > To unsubscribe from this group and stop receiving emails from it, 
> send an email to [email protected].
> >> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/1024e668-795f-454f-a659-ab5a4bf9517cn%40googlegroups.com
> .
> >
>

