Hi Robert,

On Fri, Jun 23, 2023 at 11:33:37PM +0100, Robert Newson wrote:
> Hi,
> 
> I underestimated. the heartbeat option was added back in 2009, 14 years ago,
> but I don't want to fixate on whether we made this mistake long enough ago to
> justify distorting HAProxy.

OK!

> The CouchDB dev team are discussing this internally at the moment and I'll
> update this thread if/when any conclusion comes out of that. It was noted in
> that discussion that PouchDB (https://pouchdb.com/) does the same thing btw.

Thanks for the link. It's still not very clear to me what the *exact*
communication sequence is. PouchDB mentions REST, so my feeling is that
the client sends requests and the server responds to each of them, but
then that means that responses are full messages and there shouldn't be
any delay. Thus it's not very clear to me when the heartbeats are sent.
Maybe while processing a request? If that's the case, using HTTP/1xx
interim responses might work much better. For example there used to be
the 102 code, used to indicate to browsers that some processing was in
progress, though it's not recommended anymore to send it to browsers,
which will simply ignore it. But I'm not sure what motivates this "not
recommended" given that any 1xx except 101 would fit, since you can
send several of them before the final response.
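To make this concrete, here is a rough sketch, assuming the heartbeats
are only needed while the server is still preparing the final response
(the endpoint and the use of 102 here are purely illustrative):

    Client:  GET /some/slow/endpoint HTTP/1.1
             Host: example.org

    Server:  HTTP/1.1 102 Processing       <- interim, may be repeated
             HTTP/1.1 102 Processing       <- interim
             HTTP/1.1 200 OK               <- final response
             Content-Type: application/json
             Transfer-Encoding: chunked
             ...

Each interim response is only a status line (plus optional headers)
followed by an empty line, and proxies are supposed to forward them to
the client rather than hold on to them.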

> Given the nature of the endpoint (it could well return large amounts of
> highly compressible data) we're not keen to disable compression on these
> responses as the ultimate fix, though that certainly _works_.

OK but just keep in mind that even without compression, *some* analysis
might require buffering a response. For example, some users might very
well have:

    http-response wait-for-body
    http-response deny if { res.body,lua.check_body }

where "lua.check_body" would be a Lua-based converter meant to verify
there is no information leak (credit card numbers, e-mail addresses
etc). And it could even be done using a regex.
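For the sake of illustration, such a converter could be as simple as
the sketch below (the name and the patterns are made up for this
example and certainly not production-grade):

    -- sketch only: flags bodies that look like they contain a credit
    -- card number or an e-mail address
    core.register_converters("check_body", function(body)
        if body == nil then
            return false
        end
        -- 16 digits, optionally separated by spaces or dashes
        if body:find("%d%d%d%d[ -]?%d%d%d%d[ -]?%d%d%d%d[ -]?%d%d%d%d") then
            return true
        end
        -- very rough e-mail pattern
        if body:find("[%w._%%+-]+@[%w.-]+%.%a%a+") then
            return true
        end
        return false
    end)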

With such a setup, the first 16kB (or more depending on the config) of
response body will be buffered without being forwarded until the buffer
is full, the response is complete or a rule matches. Again, with interim
responses this wouldn't be a problem because these would be forwarded
instantly.
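For the record, the amount that can be held this way follows the buffer
size, which defaults to 16kB and may be raised in the global section,
e.g.:

    global
        # 16384 is the default; wait-for-body may hold up to this much
        tune.bufsize 16384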

> We could do as little as warn in our documentation that timely delivery of
> heartbeats is not guaranteed and as much as simply ignore the heartbeat
> request parameter and proceeding as if it were not set (thus the response is
> a stream of lots of actual data and then, after a short idle timer expires,
> it terminates cleanly with an empty chunk).

Maybe, but that sounds a bit like giving up on an existing feature.

> Another thought is we could cause a configurable number of heartbeat chunks
> to be emitted instead of a single one to overcome any buffering by an
> intermediary, whether HAProxy or something else.

I don't think this would be effective. See the example above of buffering
a response: it could take something like 16k heartbeats to get past a
default-sized analysis buffer. And with compression it could be quite
difficult as well; for example, a zlib-based compressor can consume up
to 32kB of input and emit just a few bytes indicating the repetition
length.

> In brief, we have options to ponder besides altering HAProxy in ways that
> violate both the letter and spirit of HTTP law.
> 
> On reflection, I don't think HAProxy should attempt to fix this problem, but
> I thank you for holding that out as an option.

OK. Without knowing more about the protocol and sequence itself, what I
could suggest is:
  - if you're only sending heartbeats before delivering a response,
    HTTP/1xx should probably be the cleanest way to proceed. If
    requests are sent as POST, we could even imagine that emitting them
    with "expect: 100-continue" would allow the server to regularly
    send 100 responses, for example. We could also bring the discussion
    to the IETF HTTP WG to figure out why 102 is being deprecated, and
    whether some proxies merge multiple 100-continue responses into a
    single one.

  - if multiple responses are being sent as part of a single HTTP
    response and the heartbeats are placed between them, then there is
    no direct HTTP-compliant solution to this and you can only rely on
    the network and intermediaries not buffering. In this case we can
    try to work out together how to improve the situation for you,
    possibly even by detecting some communication patterns, by using
    some headers, or by using some explicit configuration. For example
    we already have "option http-no-delay", which was added a long time
    ago to avoid the 200ms TCP delay for interactive HTTP messages even
    though these do not comply with HTTP (a minimal config sketch
    follows below).
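
To make the second point concrete, here is a minimal sketch of where
such an option would go (the backend name and address are made up for
this example):

    backend couchdb
        # forward data as soon as it arrives instead of waiting for
        # more, at the cost of slightly less efficient packets
        option http-no-delay
        server db1 192.0.2.10:5984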
    
So do not hesitate to let us know.

Cheers,
Willy
