Just create a recyclable transport for the bad server and put all of the rest on a single shared transport. If one connection is returning 500 for all requests, I can’t see how a different connection would solve that, unless the backend is completely broken.

On Apr 3, 2024, at 7:48 AM, Eli Lindsey <e...@siliconsprawl.com> wrote:

It would work, but has potentially high cost since it also causes any healthy conns in the pool to be torn down. How useful it is in practice depends on request rate, number of backends behind the lb, and ratio of healthy to unhealthy (500’ing) connections. It’s hard to tell from the description if it would work here - retrying and reusing the same busted connection could mean that the request rate is very low and there’s only one idle conn (in which case cycling the transport is a good solution), or it could mean that the unhealthy conn is quicker to respond than the pooled healthy conns and gobbles up a disproportionate share of requests.

Tangential question: when the backend servers land in this state, does the lb not detect and remove them?

-eli

On Apr 3, 2024, at 6:41 AM, Robert Engels <reng...@ix.netcom.com> wrote:

That probably wasn’t clear. Why not create a Transport per host? Then, when the 500 is encountered, stop using that transport completely and create a new instance. You’d probably want to cancel any requests currently in flight.

The connection pool is per transport. 
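
Roughly this (a rough, untested sketch - the type and package names are just for illustration):

```go
package transportutil

import (
	"net/http"
	"sync"
)

// perHostTransports is a RoundTripper that keeps one Transport per host
// and throws that Transport (and its connection pool) away whenever a
// request to that host comes back with a 5xx.
type perHostTransports struct {
	mu         sync.Mutex
	transports map[string]*http.Transport
}

func (p *perHostTransports) get(host string) *http.Transport {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.transports == nil {
		p.transports = make(map[string]*http.Transport)
	}
	t, ok := p.transports[host]
	if !ok {
		// Clone whatever settings you normally use; setting an
		// IdleConnTimeout is a good idea so connections left in a
		// recycled Transport's pool eventually get closed.
		t = &http.Transport{}
		p.transports[host] = t
	}
	return t
}

// recycle drops the Transport for host so the next request gets a fresh
// one. Note it only closes idle connections - it doesn't cancel requests
// already in flight on the old Transport, and it also discards any
// healthy pooled connections to that host.
func (p *perHostTransports) recycle(host string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if t, ok := p.transports[host]; ok {
		t.CloseIdleConnections()
		delete(p.transports, host)
	}
}

func (p *perHostTransports) RoundTrip(req *http.Request) (*http.Response, error) {
	t := p.get(req.URL.Host)
	resp, err := t.RoundTrip(req)
	if err == nil && resp.StatusCode >= 500 {
		p.recycle(req.URL.Host)
	}
	return resp, err
}
```

Then use it as the client's transport, e.g. `&http.Client{Transport: &perHostTransports{}}`.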

On Apr 2, 2024, at 11:05 PM, Eli Lindsey <e...@siliconsprawl.com> wrote:

There isn’t a great way to handle this currently - we maintain out-of-tree patches to do something similar, though ours are h2-specific. The crux of the problem is that net/http currently lacks a usable connection pool API (there is some slightly newer discussion at https://github.com/golang/go/discussions/60746, but it’s similar to the issue you linked).

If you want to stay in tree, one option may be using httptrace’s GotConnInfo and calling Close on the underlying connection (in direct violation of GotConnInfo’s doc). I would expect this to error out anything in flight, but otherwise be benign (though I have not checked :) ).
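
Roughly (untested; the helper name and the >= 500 check below are just for illustration - GotConnInfo.Conn is the documented field):

```go
package transportutil

import (
	"net"
	"net/http"
	"net/http/httptrace"
)

// doAndCloseOn500 issues req and, if the response is a 5xx, closes the
// underlying connection so the pool can't hand it out again.
func doAndCloseOn500(client *http.Client, req *http.Request) (*http.Response, error) {
	var conn net.Conn
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			// GotConnInfo's doc says not to read, write, or close
			// info.Conn - doing so anyway is the whole hack here.
			conn = info.Conn
		},
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode >= 500 && conn != nil {
		// The broken responses described here have no body; if yours
		// do, drain resp.Body before closing or you'll lose it.
		conn.Close()
	}
	return resp, nil
}
```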

-eli

On Apr 2, 2024, at 3:29 PM, Jim Minter <j...@minter.uk> wrote:

Hello,

I was wondering if anyone had any ideas about https://github.com/golang/go/issues/21978 ("net/http: no Client API to close server connection based on Response") -- it's an old issue, but it's something that's biting me currently and I can't see a neat way to solve it.

As an HTTP client, I'm hitting a case where some HTTP server instance behind a load balancer breaks and starts returning 500s, with no body (FWIW) and without the "Connection: close" header.  I retry, but I end up reusing the same TCP connection to the same broken HTTP instance, so I never hit a different backend server and my retry policy is basically useless.

Obviously I need to get the server owner to fix its behavior, but it would be great if, as a client, there were a way to get net/http not to reuse the connection further, in order to be less beholden to the server's behavior.

This happens with both HTTP/1.1 and HTTP/2.

If appropriate, I could live with the request to close the connection racing with other new requests to the same endpoint.  Getting to the point where 2 or 3 requests fail and then the connection is closed is way better than having requests fail ad infinitum.

http.Transport.CloseIdleConnections() doesn't solve the problem well (a) because it's a big hammer, and (b) because there's no guarantee that the connection is idle when CloseIdleConnections() is called.

FWIW I can see in `func (pc *persistConn) readLoop()` there's the following test:

```go
if resp.Close || rc.req.Close || resp.StatusCode <= 199 || bodyWritable {
	// Don't do keep-alive on error if either party requested a close
	// or we get an unexpected informational (1xx) response.
	// StatusCode 100 is already handled above.
	alive = false
}
```

I imagine that extending that to `if resp.Close || rc.req.Close || resp.StatusCode <= 199 || bodyWritable || resp.StatusCode >= 500 {` would probably help this specific case, but I suspect that's an unacceptably large behavior change for the rest of the world.

I'm not sure how else this could be done.  Does anyone have any thoughts?

Many thanks for the help,

Jim
