This should also be changed on the client side.

The libnbd part is here:
https://gitlab.com/nbdkit/libnbd/-/merge_requests/21

We may also want to change the NBD client code used in qemu-img. I can look
into this later.
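
For illustration only, the client-side change could look roughly like the
sketch below. This is not the code from the libnbd merge request; the helper
name tune_unix_socket_buffers is hypothetical, and the 1 MiB / 4 MiB sizes are
copied from the qemu patch quoted below. Unlike the patch, the sketch checks
the setsockopt() return values.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libnbd or qemu): enlarge the buffers of
 * a unix domain socket, using the same sizes as the qemu patch.
 *
 * Returns 0 on success or if fd is not a unix socket, -1 on error (errno
 * is set by the failing call).
 */
int tune_unix_socket_buffers(int fd)
{
    struct sockaddr_storage addr;
    socklen_t len = sizeof(addr);
    const int sndbuf_size = 1024 * 1024;
    const int rcvbuf_size = 4 * sndbuf_size;

    /* Find the socket family so other socket types are left untouched. */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        return -1;
    }
    if (addr.ss_family != AF_UNIX) {
        return 0;
    }

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &sndbuf_size, sizeof(sndbuf_size)) < 0) {
        return -1;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &rcvbuf_size, sizeof(rcvbuf_size)) < 0) {
        return -1;
    }
    return 0;
}
```

A client would call this right after connect() succeeds. If it fails, the
socket still works with the default buffer sizes, so the error can be logged
and ignored.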


> On 18 Apr 2025, at 17:24, Nir Soffer <nir...@gmail.com> wrote:
> 
> Testing with qemu-nbd shows that computing a hash of an image via
> qemu-nbd is 5-7 times faster with this change.
> 
> Tested with 2 qemu-nbd processes:
> 
>    $ ./qemu-nbd-after -r -t -e 0 -f raw -k /tmp/after.sock /var/tmp/bench/data-10g.img &
>    $ ./qemu-nbd-before -r -t -e 0 -f raw -k /tmp/before.sock /var/tmp/bench/data-10g.img &
> 
> With nbdcopy, using 4 NBD connections:
> 
>    $ hyperfine -w 3 "./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/before.sock' null:" \
>                     "./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/after.sock' null:"
>    Benchmark 1: ./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/before.sock' null:
>      Time (mean ± σ):      8.670 s ±  0.025 s    [User: 5.670 s, System: 7.113 s]
>      Range (min … max):    8.620 s …  8.703 s    10 runs
> 
>    Benchmark 2: ./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/after.sock' null:
>      Time (mean ± σ):      1.839 s ±  0.008 s    [User: 4.651 s, System: 1.882 s]
>      Range (min … max):    1.830 s …  1.853 s    10 runs
> 
>    Summary
>      ./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/after.sock' null: ran
>        4.72 ± 0.02 times faster than ./nbdcopy --blkhash 'nbd+unix:///?socket=/tmp/before.sock' null:
> 
> With blksum, using one NBD connection:
> 
>    $ hyperfine -w 3 "blksum 'nbd+unix:///?socket=/tmp/before.sock'" \
>                     "blksum 'nbd+unix:///?socket=/tmp/after.sock'"
>    Benchmark 1: blksum 'nbd+unix:///?socket=/tmp/before.sock'
>      Time (mean ± σ):     13.606 s ±  0.081 s    [User: 5.799 s, System: 6.231 s]
>      Range (min … max):   13.516 s … 13.785 s    10 runs
> 
>    Benchmark 2: blksum 'nbd+unix:///?socket=/tmp/after.sock'
>      Time (mean ± σ):      1.946 s ±  0.017 s    [User: 4.541 s, System: 1.481 s]
>      Range (min … max):    1.912 s …  1.979 s    10 runs
> 
>    Summary
>      blksum 'nbd+unix:///?socket=/tmp/after.sock' ran
>        6.99 ± 0.07 times faster than blksum 'nbd+unix:///?socket=/tmp/before.sock'
> 
> This should also improve other uses of unix domain sockets on macOS, but I
> tested only qemu-nbd.
> 
> Signed-off-by: Nir Soffer <nir...@gmail.com>
> ---
> io/channel-socket.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
> 
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 608bcf066e..b858659764 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -410,6 +410,19 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
>     }
> #endif /* WIN32 */
> 
> +#if __APPLE__
> +    /* On macOS we need to tune unix domain socket buffer for best performance.
> +     * Apple recommends sizing the receive buffer at 4 times the size of the
> +     * send buffer.
> +     */
> +    if (cioc->localAddr.ss_family == AF_UNIX) {
> +        const int sndbuf_size = 1024 * 1024;
> +        const int rcvbuf_size = 4 * sndbuf_size;
> +        setsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(sndbuf_size));
> +        setsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size));
> +    }
> +#endif /* __APPLE__ */
> +
>     qio_channel_set_feature(QIO_CHANNEL(cioc),
>                             QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
> 
> -- 
> 2.39.5 (Apple Git-154)
> 

