I searched the web and the Linux commit history and found only two meaningful pieces of information: 1) kernels before Linux 2.6 had a double-bind race; 2) Linux 6.0.16 has a double-bind race, but there seem to be no reports of one in 5.x kernels.
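One way to probe a given kernel for this might be a small stress test that lets many goroutines race to listen on the same address and counts how many succeed. The following is only a sketch: the helper names `raceListen` and `freeAddr`, the goroutine count, and the number of rounds are my own choices, and a clean run does not prove that the race is absent.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// raceListen (hypothetical helper) starts n goroutines that all try to
// listen on addr at the same moment and returns how many of them succeeded.
// Successful listeners are kept open until every goroutine has finished,
// so without SO_REUSEPORT the expected result is exactly 1.
func raceListen(addr string, n int) int {
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		listeners []net.Listener
	)
	start := make(chan struct{})
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start // release all goroutines together
			ln, err := net.Listen("tcp", addr)
			if err != nil {
				return // expected for all but one goroutine
			}
			mu.Lock()
			listeners = append(listeners, ln)
			mu.Unlock()
		}()
	}
	close(start)
	wg.Wait()
	for _, ln := range listeners {
		ln.Close()
	}
	return len(listeners)
}

// freeAddr (hypothetical helper) asks the kernel for an unused loopback
// port and releases it again so that raceListen can compete for it.
func freeAddr() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	return ln.Addr().String(), nil
}

func main() {
	addr, err := freeAddr()
	if err != nil {
		fmt.Println("cannot pick a port:", err)
		return
	}
	for round := 0; round < 100; round++ {
		if got := raceListen(addr, 16); got != 1 {
			fmt.Printf("round %d: %d listeners succeeded on %s\n", round, got, addr)
		}
	}
}
```

Any round that reports more than one successful listener would be worth capturing together with the output of `ss -natp`.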
---

I also asked ChatGPT, which says:

```
"Double bind race" refers to a scenario where multiple threads/CPUs attempt to bind() to the same (IP, port, proto) almost simultaneously. Due to a race condition window in the kernel when creating and inserting an inet_bind_bucket (port binding bucket), the following may occur:

- Both threads may believe the port is available.
- Both threads may create their own inet_bind_bucket.
- The kernel might ultimately insert one bucket, but in an inconsistent state.
- This can lead to one thread's bind operation failing with an unexpected error (e.g., not EADDRINUSE), or, in older versions, even result in a temporary "successful duplicate bind" (which theoretically should not happen).

This type of race condition is typically difficult to reproduce and requires a multi-core environment with near-instantaneous concurrent attempts to bind to the same port.
```

I don't know whether this kernel bug really exists, or whether the problem is caused by a bug in some virtualization technology.

On Tuesday, November 18, 2025 at 8:49:18 PM UTC+8 Zhang Jie (Kn) wrote:

> The release is Tencent tlinux3; the kernel is Linux 5.4, modified by
> Tencent.
>
> ---
>
> In Go, net.ListenTCP sets REUSEADDR so that the same ip:port can be reused
> quickly, but listening twice shouldn't succeed unless REUSEPORT is set.
>
> When the problem occurs, we use `fuser port/tcp` to check whether
> only one process is listening on the same ip:port. Yes, there's only one.
> The other process trying to listen on the same ip:port succeeded:
> ```
> ln, err := net.ListenTCP(...)
> ```
> Here err is nil.
>
> Then:
> ```
> conn, err := ln.Accept()
> ```
> Here conn is nil and err != nil, but our previous code ignored the
> error (bad practice), so I don't know what error it returned.
> And I cannot reproduce this problem.
> On Tuesday, November 18, 2025 at 8:38:59 PM UTC+8 Robert Engels wrote:
>
>> I believe that if the port has previous connections still in the
>> CLOSE_WAIT state (could be a previous run of the same app) the port cannot
>> be opened.
>>
>> Linux also has a REUSE_PORT option that allows multiple processes to
>> bind to the same port and it balances the incoming requests automatically.
>>
>> On Nov 18, 2025, at 3:36 AM, 'Brian Candler' via golang-nuts <
>> [email protected]> wrote:
>>
>> When the problem occurs, I suggest you look at "ss -natp" ("netstat
>> -natp" on older systems) and see if you really do have two listening
>> sockets on the same port and address.
>>
>> If you do, that seems like a kernel bug / some sort of race. What kernel
>> version is the VM running? (The kernel on the physical host shouldn't
>> really make any difference).
>>
>> On Tuesday, 18 November 2025 at 03:11:24 UTC Zhang Jie (Kn) wrote:
>>
>>> Hello everyone,
>>>
>>> Over the past year, I've encountered two strange issues
>>> with net.ListenTCP and listener.Accept. Without explicitly
>>> enabling reuseport, multiple service processes on the same machine, all
>>> searching for available ports starting from 9000, managed to successfully
>>> call listen on the same IP and port. At least when calling net.ListenTCP, it
>>> returned err == nil, and the error only appeared during listener.Accept.
>>> However, at the time, we weren't explicitly checking the returned error or
>>> printing the error message. Instead, when we found the returned conn ==
>>> nil, we kept retrying listener.Accept in a for-loop.
>>>
>>> We've reproduced this issue twice within a year. The environment was a
>>> virtual machine allocated on a physical host with a Linux 5.4 kernel, and
>>> it was very difficult to reproduce. Our immediate fix was to add the error
>>> checking logic and print the specific error. While handling this issue, we
>>> also ran into the problem with netError.Temporary().
>>>
>>> I completely agree with Ian's insight: "Whether an error is temporary
>>> depends on what you were doing at the time." For the specific case
>>> of listener.Accept(), even if netError.Temporary() returns true, retrying
>>> doesn't necessarily mean the service can remain available. Errors always
>>> manifest in wildly different ways. In our specific flawed usage scenario,
>>> the service had already successfully registered with the name service, and
>>> other services had already discovered it and started sending requests.
>>> However, because the listen wasn't actually successful (the IP:port was held
>>> by another process), it resulted in persistent access failures.
>>>
>>> But if we don't use Temporary(), asking developers to enumerate all
>>> possible temporary errors that can be retried isn't a very straightforward
>>> task. Could several categorical functions, similar to IsTimeout, be
>>> provided to allow developers to combine them freely? For example, something
>>> like if ne.IsTimeout() || ne.IsXXX() || ne.IsYYY().
>>>
>>> On Friday, April 22, 2022 at 6:39:39 AM UTC+8 Caleb Spare wrote:
>>>
>>>> On Thu, Apr 21, 2022 at 7:16 AM 'Bryan C. Mills' via golang-nuts
>>>> <[email protected]> wrote:
>>>> >
>>>> > Even ENFILE and EMFILE are not necessarily blindly retriable: if the
>>>> > process has run out of files, it may be because they have leaked (for
>>>> > example, they may be reachable from deadlocked goroutines).
>>>> > If that is the case, it is arguably better for the program to fail
>>>> > with a useful error than to keep retrying without making progress.
>>>> >
>>>> > (I would argue that the retry loop in net/http.Server is a mistake,
>>>> > and should be replaced with a user-configurable semaphore limiting the
>>>> > number of open connections, thus avoiding the file exhaustion in the
>>>> > first place!)
>>>>
>>>> ENFILE might be caused by a different process entirely, no?
>>>>
>>>> >
>>>> > On Wednesday, April 20, 2022 at 10:49:20 PM UTC-4 Ian Lance Taylor wrote:
>>>> >>
>>>> >> On Wed, Apr 20, 2022 at 6:46 PM 'Damien Neil' via golang-nuts
>>>> >> <[email protected]> wrote:
>>>> >> >
>>>> >> > The reason for deprecating Temporary is that the set of "temporary"
>>>> >> > errors was extremely ill-defined. The initial issue for
>>>> >> > https://go.dev/issue/45729 discusses the de facto definition of
>>>> >> > Temporary and the confusion resulting from it.
>>>> >> >
>>>> >> > Perhaps there's a useful definition of temporary or retriable
>>>> >> > errors, perhaps limited in scope to syscall errors such as EINTR and
>>>> >> > EMFILE. I don't know what that definition is, but perhaps we should
>>>> >> > come up with one and add an os.ErrTemporary or some such. I don't
>>>> >> > think leaving net.Error.Temporary undeprecated was the right choice,
>>>> >> > however; the need for a good way to identify transient system errors
>>>> >> > such as EMFILE doesn't mean that it was a good way to do so or could
>>>> >> > ever be made into one.
>>>> >>
>>>> >> To frame issue 45729 in a different way, whether an error is temporary
>>>> >> is not a general characteristic. It depends on the context in which
>>>> >> it appears. For the Accept loop in http.Server.Serve really the only
>>>> >> plausible temporary errors are ENFILE and EMFILE. Perhaps the net
>>>> >> package needs a RetriableAcceptError function.
>>>> >>
>>>> >> Ian
>>>> >>
>>>> >> > On Wednesday, April 20, 2022 at 6:02:34 PM UTC-7 [email protected] wrote:
>>>> >> >>
>>>> >> >> In Go 1.18 net.Error.Temporary was deprecated (see
>>>> >> >> https://go.dev/issue/45729). However, in trying to remove it from my
>>>> >> >> code, I found one way in which Temporary is used for which there is no
>>>> >> >> obvious replacement: in a TCP server's Accept loop, when deciding
>>>> >> >> whether to wait and retry an Accept error.
>>>> >> >>
>>>> >> >> You can see an example of this in net/http.Server today:
>>>> >> >> https://github.com/golang/go/blob/ab9d31da9e088a271e656120a3d99cd3b1103ab6/src/net/http/server.go#L3047-L3059
>>>> >> >>
>>>> >> >> In this case, Temporary seems useful, and enumerating the OS-specific
>>>> >> >> errors myself doesn't seem like a good idea.
>>>> >> >>
>>>> >> >> Does anyone have a good solution here? It doesn't seem like this was
>>>> >> >> adequately considered when making this deprecation decision.
>>>> >> >>
>>>> >> >> Caleb
>>>> >> >
>>>> >> > --
>>>> >> > You received this message because you are subscribed to the Google Groups "golang-nuts" group.
>>>> >> > To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>> >> > To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/1024e668-795f-454f-a659-ab5a4bf9517cn%40googlegroups.com.
