Hi Maximilian, sorry for the delay. I'm sure I noticed the subject but likely archived the message before reading it. Thanks to Lukas for pinging about it again in GH issue #352, which was already about this!
On Tue, Jul 23, 2024 at 08:38:44AM +0000, Moehl, Maximilian wrote:
> We've received reports from users who are struggling to upload large
> files via our HAProxies. With some testing, we discovered that h1 is
> relatively stable as latency increases while h2 constantly degrades.
> We've narrowed the issue down to `tune.h2.fe.initial-window-size`,
> whose documentation states:
>
> > The default value of 65536 allows up to 5 Mbps of bandwidth per client
> > over a 100 ms ping time, and 500 Mbps for 1 ms ping time.
>
> This is in line with our experiments.
>
> While researching the topic, I came across a blog post from
> Cloudflare [1] where they mention improvements to their edge that
> dynamically adjust the window size to accommodate the latency and
> bandwidth constraints. They've upstreamed the patch to nginx [2]
> (although I must admit that I'm not sure whether the change fixes an
> nginx-specific issue or the general issue we are experiencing here).
>
> Is there a similar mechanism in HAProxy? So far I can only see the
> static option for the initial window size, which comes with the
> mentioned drawbacks.

There is nothing similar. One of the problems H2 faces is that there can be application congestion anywhere in the chain. For example, let's say a POST request is sent to a server subject to a maxconn and remains in the queue for a few seconds. We definitely don't want to block the whole connection during this time because we've oversized the window. And there's also the problem of not allocating too many buffers to each stream, or it becomes a trivial DoS.

However, as I mentioned somewhere (maybe in the issue above, I don't remember), I think that the vast majority of users are not downloading and uploading at the same time over the same connection. Either they're uploading a large file or downloading content to be rendered. Maybe in this case we could consider that the buffers allocatable by the mux could in fact be repurposed for any stream.
We already have one buffer per stream. If the stream receiving the POST were able to allocate more buffers, it could then relieve the connection of pending frames and allow the client to use a larger send window. One of the difficulties is figuring out how to allocate them, but we can afford a few round trips during a large upload, so the stream itself could increase the window when allocating new buffers. In our case, buffers were really not meant to be extensible on the rx side, but I suspect it might not be too hard to do.

I'll also need to have a look at what Cloudflare did for nginx; they might have updated that based on their observations and corner cases.

That's something to think about. Thanks for re-heating this old topic!

Willy