Hi Robert,

On Fri, Jun 23, 2023 at 11:33:37PM +0100, Robert Newson wrote:
> Hi,
>
> I underestimated. The heartbeat option was added back in 2009, 14 years
> ago, but I don't want to fixate on whether we made this mistake long
> enough ago to justify distorting HAProxy.
OK!

> The CouchDB dev team are discussing this internally at the moment and I'll
> update this thread if/when any conclusion comes out of that. It was noted in
> that discussion that PouchDB (https://pouchdb.com/) does the same thing btw.

Thanks for the link. It's still not very clear to me what the *exact*
communication sequence is. PouchDB mentions REST, so my feeling is that the
client sends requests and the server responds to each one, but then that
means the responses are full messages and there shouldn't be any delay.
Thus it's not very clear to me when the heartbeats are sent. Maybe while
processing a request? If that's the case, using HTTP/1xx interim responses
might work much better. For example, there used to be the 102 status code,
used to indicate to browsers that some processing was in progress, though
it's not recommended anymore to send it to browsers, which will simply
ignore it. But I'm not sure what motivates this "not recommended", given
that any 1xx except 101 would fit, since you can send several of them
before the final response.

> Given the nature of the endpoint (it could well return large amounts of
> highly compressible data) we're not keen to disable compression on these
> responses as the ultimate fix, though that certainly _works_.

OK, but just keep in mind that even without compression, *some* analysis
might require buffering a response. For example, some users might very
well have:

    http-response wait-for-body
    http-response deny if { res.body,lua.check_body }

where "lua.check_body" would be a Lua-based converter meant to verify there
is no information leak (credit card numbers, e-mail addresses, etc.). It
could even be done using a regex. With such a setup, the first 16kB (or
more, depending on the config) of the response body will be buffered
without being forwarded until the buffer is full, the response is complete,
or a rule matches. Again, with interim responses this wouldn't be a
problem, because those would be forwarded instantly.
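To make the interim-response idea concrete, here is a minimal sketch (a toy server and client over raw sockets, not CouchDB's actual implementation) of what the wire exchange could look like: a couple of "102 Processing" interim status lines sent as heartbeats, then the final 200 carrying the body.

```python
import socket
import threading

def serve_once(srv):
    """Toy HTTP/1.1 server: two 102 interim responses, then the final 200."""
    conn, _ = srv.accept()
    conn.recv(4096)  # read (and ignore) the request
    # Each interim response is a complete status line plus an empty header
    # block; any 1xx except 101 may precede the final response.
    conn.sendall(b"HTTP/1.1 102 Processing\r\n\r\n")
    conn.sendall(b"HTTP/1.1 102 Processing\r\n\r\n")
    # The final response carries the actual body.
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()

srv = socket.create_server(("127.0.0.1", 0))
port = srv.getsockname()[1]
threading.Thread(target=serve_once, args=(srv,), daemon=True).start()

cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"GET /db/_changes HTTP/1.1\r\nHost: example\r\n\r\n")
raw = b""
while b"ok" not in raw:
    chunk = cli.recv(4096)
    if not chunk:
        break
    raw += chunk
cli.close()
print(raw.count(b"102 Processing"))  # interim responses seen before the 200
```

The key property is that each 102 is a complete interim response, so a compliant intermediary forwards it immediately instead of holding it in a compression or analysis buffer the way body chunks can be held.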
> We could do as little as warn in our documentation that timely delivery of
> heartbeats is not guaranteed, and as much as simply ignore the heartbeat
> request parameter and proceed as if it were not set (thus the response is
> a stream of lots of actual data and then, after a short idle timer expires,
> it terminates cleanly with an empty chunk).

Maybe, but that sounds a bit like giving up on an existing feature.

> Another thought is we could cause a configurable number of heartbeat chunks
> to be emitted instead of a single one to overcome any buffering by an
> intermediary, whether HAProxy or something else.

I don't think this would be effective. See the buffered-response example
above: it could require 16k heartbeats to overcome a default buffer
analysis. And with compression it could be quite difficult as well; for
example, a zlib-based compressor can consume up to 32kB of input while
producing just a few bytes indicating the repetition length.

> In brief, we have options to ponder besides altering HAProxy in ways that
> violate both the letter and spirit of HTTP law.
>
> On reflection, I don't think HAProxy should attempt to fix this problem, but
> I thank you for holding that out as an option.

OK. Without knowing more about the protocol and sequence itself, what I
could suggest is:

  - if you're only sending heartbeats before delivering a response,
    HTTP/1xx should probably be the cleanest way to proceed. If requests
    are sent as POST, we could even imagine that emitting them with
    "expect: 100-continue" allows the server to regularly send 100, for
    example. We could also bring the discussion to the IETF HTTP WG to
    figure out why 102 is being deprecated, and whether some proxies merge
    multiple 100-continue responses into a single one.

  - if multiple responses are being sent as part of a single HTTP response
    and the heartbeats are placed between them, then there is no direct
    HTTP-compliant solution, and you can only rely on the network and
    intermediaries not buffering.
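As a back-of-the-envelope check of the compression point above (assuming, as I understand it, each heartbeat is a single newline byte), even the 16k heartbeats needed to fill a 16kB analysis buffer collapse to almost nothing once compressed:

```python
import zlib

# Hypothetical illustration: 16384 one-byte newline heartbeats (enough to
# fill a 16kB buffer) compress to a tiny fraction of their original size,
# so emitting "more heartbeats" cannot reliably flush a compressing
# intermediary either.
heartbeats = b"\n" * 16384
compressed = zlib.compress(heartbeats)
print(len(heartbeats), len(compressed))
```

So from the compressor's point of view, a flood of identical heartbeat bytes is just one long repetition: it shrinks to a few dozen bytes rather than pushing data through the pipe.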
In this case we can try to find together how to improve the situation for
you, possibly even by detecting some communication patterns, by using some
headers, or by using some explicit configuration. For example, we already
have "option http-no-delay", which was added a long time ago to avoid the
200ms TCP delay for interactive HTTP messages, even though these do not
comply with HTTP. Thus do not hesitate to let us know.

Cheers,
Willy
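For reference, a minimal sketch of how that option is enabled (the backend name and server address are placeholders, not a recommended setup):

```
# Hypothetical backend for a CouchDB _changes feed.  "option http-no-delay"
# tells HAProxy to forward data in both directions as fast as possible,
# trading some TCP efficiency for lower interactive latency.
backend couchdb_changes
    mode http
    option http-no-delay
    server couch1 127.0.0.1:5984
```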