> I also take this chance to ask a second question, out of curiosity:
> with HTTP/2 multiplex enabled, will libcurl also attempt to open
> concurrent connections, and do multiplex on all these connections? Or
> does it stick to one connection?
From what I've observed, libcurl will try to open new connections to the
same host, depending on CURLMOPT_MAX_HOST_CONNECTIONS and
CURLMOPT_MAX_TOTAL_CONNECTIONS. Here is the relevant piece of code from
url.c; note the condition that checks both limits:

/* If we found a reusable connection that is now marked as in use, we
   may still want to open a new connection if we are pipelining. */
if(reuse && !force_reuse && IsPipeliningPossible(data, conn_temp)) {
  size_t pipelen = conn_temp->send_pipe.size + conn_temp->recv_pipe.size;
  if(pipelen > 0) {
    infof(data, "Found connection %ld, with requests in the pipe (%zu)"
          " conn_bund:%u maxperhost:%u conncache:%u maxtotal:%u\n",
          conn_temp->connection_id, pipelen,
          Curl_conncache_bundle_size(conn_temp), max_host_connections,
          Curl_conncache_size(data), max_total_connections);

    if(Curl_conncache_bundle_size(conn_temp) < max_host_connections &&
       Curl_conncache_size(data) < max_total_connections) {
      /* We want a new connection anyway */
      reuse = FALSE;
      infof(data, "We can reuse, but we want a new connection anyway\n");
      Curl_conncache_return_conn(conn_temp);
    }
  }
}
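For completeness, this is roughly how those limits are set from the
application side. Just a minimal sketch; 4 and 8 are example values, not
recommendations:

#include <curl/curl.h>

CURLM *make_multi(void)
{
  CURLM *multi = curl_multi_init();

  /* At most 4 connections to any single host... */
  curl_multi_setopt(multi, CURLMOPT_MAX_HOST_CONNECTIONS, 4L);

  /* ...and at most 8 connections in total, across all hosts. */
  curl_multi_setopt(multi, CURLMOPT_MAX_TOTAL_CONNECTIONS, 8L);

  /* Prefer HTTP/2 multiplexing on existing connections over opening
     new ones, where the server supports it. */
  curl_multi_setopt(multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);

  return multi;
}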
~Kunal

On Thu, Mar 21, 2019 at 11:49 AM Arnaud Rebillout via curl-library
<curl-library@cool.haxx.se> wrote:

> Hi libcurl devs,
>
> I'm writing an application that uses libcurl, and I have no prior
> expertise with HTTP, so I'd like to make sure I got things right.
>
> I'm working on the internal http client of casync [1]. This client is
> simple: it basically has a list of files to download, and its job is to
> download them efficiently. We're talking about small chunks of data
> (around 64KB), but the list is possibly huge (60,000 chunks is very
> possible). And we talk to only ONE server.
>
> Since we live in a modern world, I explicitly enable
> `CURL_HTTP_VERSION_2_0` and `CURLPIPE_MULTIPLEX`, and I assume that the
> server supports them.
>
> In a **first implementation**, I just create a curl easy handle for
> each chunk I need to download (so, possibly 60k easy handles), add it
> to the curl multi, and then I let curl deal with it. I also make sure
> to set `CURLMOPT_MAX_TOTAL_CONNECTIONS` to ensure that the whole thing
> doesn't go crazy (I used 64 at first, but after more reading I wonder
> if I should lower that to 8).
>
> It works well this way. Too well, even. My issue, during local tests
> (with both client and server on my machine), is that the client isn't
> fast enough to handle all the incoming chunks. Indeed, the client needs
> to hand the chunks to another co-process through a custom IPC, and this
> proved to be the bottleneck. So what happened is that all my chunks
> were downloaded very quickly, and then sat in RAM until the client had
> time to forward them to its co-process. Even though it works, it can
> use a lot of RAM, and that's not nice.
>
> Of course, this doesn't happen in "real life", when the server is
> remote and the latency is higher. Then the client has time to handle
> the chunks, and everything works beautifully.
>
> I didn't find a way to tell libcurl to pause or slow down in case
> things go too fast, so I went for a **second implementation**, slightly
> different. I decided that instead of creating one easy handle per chunk
> request and feeding them all to the curl multi handle, I would only
> create a small number of easy handles (let's say 8) and give them to
> curl multi. Only when a chunk is downloaded and handled by the client
> do I re-use the easy handle (i.e. remove it from the multi handle, set
> a new URL, and give it back to the curl multi for processing).
>
> This implementation works well too.
>
> Now I take a bit of time to think, and I wonder if this second
> implementation is really the smart thing to do. More precisely: by
> feeding handles one by one (even though we might have 8 active handles
> in curl multi at the same time), do I prevent internal optimization
> within libcurl? How can libcurl multiplex efficiently if I don't tell
> it in advance the list of chunks I want to download?
>
> So basically, I think that my first implementation was better than the
> second one. Can you agree or disagree, based on your knowledge of
> libcurl internals?
>
> I also take this chance to ask a second question, out of curiosity:
> with HTTP/2 multiplex enabled, will libcurl also attempt to open
> concurrent connections, and do multiplex on all these connections? Or
> does it stick to one connection?
>
> Thanks!
>
> Arnaud
>
> ----
>
> [1]:
> http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html
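For what it's worth, the handle-recycling scheme described above would
look roughly like this. A sketch only: next_chunk_url() and
deliver_chunk() are hypothetical stand-ins for casync's own chunk list
and IPC hand-off, and error checking is omitted.

#include <curl/curl.h>

/* Hypothetical helpers, not part of libcurl or casync's public API. */
extern const char *next_chunk_url(void); /* NULL when the list is done */
extern void deliver_chunk(CURL *easy);   /* pass finished data along */

#define POOL_SIZE 8

static void start_transfer(CURLM *multi, CURL *easy, const char *url)
{
  curl_easy_setopt(easy, CURLOPT_URL, url);
  curl_easy_setopt(easy, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);
  curl_easy_setopt(easy, CURLOPT_PIPEWAIT, 1L); /* wait to multiplex */
  curl_multi_add_handle(multi, easy);
}

void download_all(CURLM *multi)
{
  int active = 0;

  /* Prime the pool: one easy handle per concurrent transfer. */
  for(int i = 0; i < POOL_SIZE; i++) {
    const char *url = next_chunk_url();
    if(!url)
      break;
    start_transfer(multi, curl_easy_init(), url);
    active++;
  }

  while(active) {
    CURLMsg *msg;
    int running, queued;

    curl_multi_perform(multi, &running);
    if(running)
      curl_multi_wait(multi, NULL, 0, 1000, NULL);

    /* Recycle every finished handle with the next URL in the list. */
    while((msg = curl_multi_info_read(multi, &queued))) {
      if(msg->msg == CURLMSG_DONE) {
        CURL *easy = msg->easy_handle;
        const char *url;

        deliver_chunk(easy); /* body arrived via the write callback */
        curl_multi_remove_handle(multi, easy);
        active--;

        url = next_chunk_url();
        if(url) {
          start_transfer(multi, easy, url);
          active++;
        }
        else
          curl_easy_cleanup(easy);
      }
    }
  }
}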
--
~Kunal

-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html