On Thu, 21 Mar 2019, Arnaud Rebillout via curl-library wrote:

Since we live in a modern world, I explicitly enable `CURL_HTTP_VERSION_2_0` and `CURLPIPE_MULTIPLEX`, and I assume that the server supports it.

Since curl 7.62.0, those will be enabled by default anyway!

In a **first implementation**, I just create a curl easy handle for each chunk I need to download (so, possibly 60k easy handles), add it to the curl multi, and then I let curl deal with it. I also make sure to set `CURLMOPT_MAX_TOTAL_CONNECTIONS` to ensure that the whole thing doesn't go crazy (I used 64 at first, but after more reading I wonder if I should lower that to 8).

For all easy handles you add to the multi handle, libcurl will try to complete each transfer as soon as possible, so unless you limit it somehow it runs them all in parallel.

Whenever a new transfer is about to be done it will check if there's an existing connection available to multiplex on and will do so if possible. If not, it will create a new connection instead (unless it reached a limit first or was informed with PIPEWAIT that waiting for a connection to multiplex on is okay).

I didn't find a way to tell libcurl to pause or slow down in case things go too fast,

You can either curl_easy_pause() a single transfer momentarily or you can set CURLOPT_MAX_RECV_SPEED_LARGE to cap the speed for a single transfer.

Both of these work on a single transfer though. We don't have settings that limit the transfer speed of multiple, combined, transfers.

Now, I take a bit of time to think, and I wonder if this second implementation is really the smart thing to do. More precisely: by feeding handles one by one (even though we might have 8 active handles in curl multi at the same time), do I prevent internal optimization within libcurl? How can libcurl multiplex efficiently if I don't tell it in advance the list of chunks I want to download?

It will multiplex equally well. Each new transfer you ask for will join an existing connection - if possible - at the time it starts. There's really no difference to curl; that decision is made when the transfer starts anyway. The main difference between your two solutions is that in the first case you hand over a lot of the transfer queueing to curl, while in the second case you do it yourself.

Without knowing all the factors and details of your solution the way you do, I would say that the second solution sounds more flexible for you. It gives your application more room to act depending on circumstances during the transfer.

I also take this chance to ask a second question, out of curiosity: with HTTP/2 multiplex enabled, will libcurl also attempt to open concurrent connections, and do multiplex on all these connections? Or does it stick to one connection?

If there's a multiplexable connection available, that one will be used.

If not, you can make it prefer multiplexing to new connections by setting CURLOPT_PIPEWAIT, or even limit the number of connections or host connections that it is allowed to use.

--

 / daniel.haxx.se
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html
