This experimental series changes the way that the curl plugin deals with libcurl handles. It also changes the thread model of the plugin from SERIALIZE_REQUESTS to PARALLEL.
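To make the handle change concrete, here is a minimal sketch of a fixed-size pool in which each handle is lent to at most one thread at a time, which is the property libcurl needs under a parallel thread model. It is an illustration only, not the code from the series: struct handle stands in for a libcurl easy handle, and get_handle, put_handle and POOL_SIZE are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4   /* assumption: mirrors the default of 4 handles */

/* Hypothetical stand-in for a libcurl easy handle (CURL *). */
struct handle {
  bool in_use;
};

static struct handle pool[POOL_SIZE];
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pool_cond = PTHREAD_COND_INITIALIZER;

/* Borrow a free handle, blocking until one becomes available. */
static struct handle *
get_handle (void)
{
  struct handle *h = NULL;

  pthread_mutex_lock (&pool_lock);
  while (h == NULL) {
    for (size_t i = 0; i < POOL_SIZE; ++i) {
      if (!pool[i].in_use) {
        h = &pool[i];
        h->in_use = true;
        break;
      }
    }
    /* Every handle is busy: sleep until put_handle signals. */
    if (h == NULL)
      pthread_cond_wait (&pool_cond, &pool_lock);
  }
  pthread_mutex_unlock (&pool_lock);
  return h;
}

/* Return a borrowed handle to the pool and wake one waiter. */
static void
put_handle (struct handle *h)
{
  assert (h != NULL);

  pthread_mutex_lock (&pool_lock);
  assert (h->in_use);
  h->in_use = false;
  pthread_cond_signal (&pool_cond);
  pthread_mutex_unlock (&pool_lock);
}
```

Note that in this sketch any thread can receive any free handle, so nothing ties a handle to the thread that used it last.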
Currently one NBD connection opens one libcurl handle. This also implies one TCP connection to the web server. To open multiple libcurl handles (and multiple TCP connections), the client must open multiple NBD connections, e.g. using multi-conn.

After this series, there is a pool of libcurl handles shared across all NBD connections. The pool defaults to 4 handles, but this can be changed using the connections=N parameter.

Previously the plugin relied on nbdkit SERIALIZE_REQUESTS to ensure that a curl handle could not be used from multiple threads at the same time (https://curl.se/libcurl/c/threadsafe.html). After this change it is possible to use the PARALLEL thread model. This change is quite valuable because it means we can use filters like readahead and scan.

Anyway, this all seems to work, but it actually reduces performance :-(

In particular this simple test slows down quite substantially:

  time ./nbdkit -r -U - curl file:/var/tmp/fedora-36.img --run 'nbdcopy --no-extents -p "$uri" null:'

(where /var/tmp/fedora-36.img is a 10G file).

I've been looking at flamegraphs all morning and I can't really see what the problem is (except that a lot more time is spent in libcurl calling sigaction?!?)

I'm wondering if it might be a locality issue, since curl handles are now being scattered randomly across threads. (In the file: case, it might mean that Linux kernel readahead is ineffective.) I can't easily see a way to change the implementation to encourage handles to be reused by the same thread.

Well, here we are ...

Rich.

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs