Yes, I agree, I think this approach makes sense (and should have been obvious to me as something to try...). It could be implementable as a wrapper transport too. I'll try it out and reply back here if it doesn't work.
Thank-you! Jim On Wed, 3 Apr 2024 at 12:46, Robert Engels <reng...@ix.netcom.com> wrote: > Just create a recyclable transport for the bad server and put all of the > rest on a single shared transport. If one connection is returning 500 for > all requests I can’t see how a different connection would solve that - > unless the backend is completely broken. > > On Apr 3, 2024, at 7:48 AM, Eli Lindsey <e...@siliconsprawl.com> wrote: > > It would work, but has potentially high cost since it also causes any > healthy conns in the pool to be torn down. How useful it is in practice > depends on request rate, number of backends behind the lb, and ratio of > healthy to unhealthy (500’ing) connections. It’s hard to tell from the > description if it would work here - retrying and reusing the same busted > connection could mean that the request rate is very low and there’s only > one idle conn (in which case cycling the transport is a good solution), or > it could mean that the unhealthy conn is quicker to respond than the pooled > healthy conns and gobbles up a disproportionate share of requests. > > Tangential question, when the backend servers land in this state does the > lb not detect and remove them? > > -eli > > On Apr 3, 2024, at 6:41 AM, Robert Engels <reng...@ix.netcom.com> wrote: > > That probably wasn’t clear. Why not create a Transport per host. Then when > the 500 is encountered stop using that transport completely and create a > new instance. Probably want to cancel any requests currently in flight. > > The connection pool is per transport. > > On Apr 2, 2024, at 11:05 PM, Eli Lindsey <e...@siliconsprawl.com> wrote: > > There isn’t a great way to handle this currently - we maintain out of > tree patches to do something similar, though ours are h2 specific. The crux > of the problem is that net currently lacks a usable connection pool API > (there is some slightly newer discussion here, but it’s similar to the > issue you linked https://github.com/golang/go/discussions/60746). > > If you want to stay in tree, one option may be using httptrace GotConnInfo > and calling Close on the underlying connection (in direct violation of > GotConnInfo’s doc). I would expect this to error out anything inflight, but > otherwise be benign (though I have not checked :) ). > > -eli > > On Apr 2, 2024, at 3:29 PM, Jim Minter <j...@minter.uk> wrote: > > Hello, > > I was wondering if anyone had any ideas about > https://github.com/golang/go/issues/21978 ("net/http: no Client API to > close server connection based on Response") -- it's an old issue, but it's > something that's biting me currently and I can't see a neat way to solve it. > > As an HTTP client, I'm hitting a case where some HTTP server instance > behind a load balancer breaks and starts returning 500s (FWIW with no body) > and without the "Connection: close" header. I retry, but I end up reusing > the same TCP connection to the same broken HTTP instance, so I never hit a > different backend server and my retry policy is basically useless. > > Obviously I need to get the server owner to fix its behavior, but it would > be great if, as a client, there were a way to get net/http not to reuse the > connection further, in order to be less beholden to the server's behavior. > > This happens with both HTTP/1.1 and HTTP/2. > > If appropriate, I could live with the request to close the connection > racing with other new requests to the same endpoint. Getting to the point > where 2 or 3 requests fail and then the connection is closed is way better > than having requests fail ad infinitum. > > http.Transport.CloseIdleConnections() doesn't solve the problem well (a) > because it's a big hammer, and (b) because there's no guarantee that the > connection is idle when CloseIdleConnections() is called. > > FWIW I can see in `func (pc *persistConn) readLoop()` there's the > following test: > > ```go > if resp.Close || rc.req.Close || resp.StatusCode <= 199 || bodyWritable { > // Don't do keep-alive on error if either party requested a close > // or we get an unexpected informational (1xx) response. > // StatusCode 100 is already handled above. > alive = false > } > ``` > > I imagine that extending that to `if resp.Close || rc.req.Close || > resp.StatusCode <= 199 || bodyWritable || resp.StatusCode >= 500 {` might > probably help this specific case, but I imagine that's an unacceptably > large behavior change for the rest of the world. > > I'm not sure how else this could be done. Does anyone have any thoughts? > > Many thanks for the help, > > Jim > > -- > You received this message because you are subscribed to the Google Groups > "golang-nuts" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-nuts+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/golang-nuts/34d597cf-a84c-48eb-b555-537a8768f468n%40googlegroups.com > <https://groups.google.com/d/msgid/golang-nuts/34d597cf-a84c-48eb-b555-537a8768f468n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > > > > -- > You received this message because you are subscribed to the Google Groups > "golang-nuts" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-nuts+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/golang-nuts/080B6923-51DA-4DDB-9400-B1054C1DFCE4%40siliconsprawl.com > <https://groups.google.com/d/msgid/golang-nuts/080B6923-51DA-4DDB-9400-B1054C1DFCE4%40siliconsprawl.com?utm_medium=email&utm_source=footer> > . > > > -- > You received this message because you are subscribed to the Google Groups > "golang-nuts" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-nuts+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/golang-nuts/D0363149-4F68-42A9-8B5B-DFAD8AC36B87%40siliconsprawl.com > <https://groups.google.com/d/msgid/golang-nuts/D0363149-4F68-42A9-8B5B-DFAD8AC36B87%40siliconsprawl.com?utm_medium=email&utm_source=footer> > . > > -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAC1koBjmbiGDTDkGKE0%2BCdEroJcpma3vu128%3DQgrKDSY3N3GBw%40mail.gmail.com.