> On 18 Apr 2025, at 21:55, Eric Blake <ebl...@redhat.com> wrote:
>
> On Fri, Apr 18, 2025 at 05:24:36PM +0300, Nir Soffer wrote:
>> Testing with qemu-nbd shows that computing a hash of an image via
>> qemu-nbd is 5-7 times faster with this change.
>>
>
>> +++ b/io/channel-socket.c
>> @@ -410,6 +410,19 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
>>      }
>>  #endif /* WIN32 */
>>
>> +#if __APPLE__
>> +    /* On macOS we need to tune unix domain socket buffer for best performance.
>> +     * Apple recommends sizing the receive buffer at 4 times the size of the
>> +     * send buffer.
>> +     */
>> +    if (cioc->localAddr.ss_family == AF_UNIX) {
>> +        const int sndbuf_size = 1024 * 1024;
>> +        const int rcvbuf_size = 4 * sndbuf_size;
>> +        setsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(sndbuf_size));
>> +        setsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size));
>> +    }
>> +#endif /* __APPLE__ */
>
> Why does this have to be limited? On linux, 'man 7 unix' documents
> that SO_SNDBUF is honored (SO_RCVBUF is silently ignored but accepted
> for compatibility). On the other hand, 'man 7 socket' states that it
> defaults to the value in /proc/sys/net/core/wmem_default (212992 on my
> machine) and cannot exceed the value in /proc/sys/net/core/wmem_max
> without CAP_NET_ADMIN privileges (also 212992 on my machine).
>
> Of course, Linux and MacOS are different kernels, so your effort to
> set it to 1M may actually be working on Apple rather than being
> silently cut back to the enforced maximum.
Testing shows that values up to a 2M send buffer and an 8M receive buffer
still change the performance, so they are not being silently clipped.

> And the fact that raising
> it at all makes a difference merely says that unlike Linux (where the
> default appears to already be as large as possible), Apple is set up
> to default to a smaller buffer (more fragmentation requires more
> time), and bumping to the larger value improves performance. But can
> you use getsockopt() prior to your setsockopt() to see what value
> Apple was defaulting to, and then again afterwards to see whether it
> actually got as large as you suggested?

Sure, tested with:

diff --git a/io/channel-socket.c b/io/channel-socket.c
index b858659764..9600a076be 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -418,8 +418,21 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
     if (cioc->localAddr.ss_family == AF_UNIX) {
         const int sndbuf_size = 1024 * 1024;
         const int rcvbuf_size = 4 * sndbuf_size;
+        int value;
+        socklen_t value_size = sizeof(value);
+
+        getsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &value, &value_size);
+        fprintf(stderr, "before: send buffer size: %d\n", value);
+        getsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &value, &value_size);
+        fprintf(stderr, "before: recv buffer size: %d\n", value);
+
         setsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(sndbuf_size));
         setsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size));
+
+        getsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &value, &value_size);
+        fprintf(stderr, "after: send buffer size: %d\n", value);
+        getsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &value, &value_size);
+        fprintf(stderr, "after: recv buffer size: %d\n", value);
     }
 #endif /* __APPLE__ */

With 1M send buffer:

% ./qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock ~/bench/data-10g.img
before: send buffer size: 8192
before: recv buffer size: 8192
after: send buffer size: 1048576
after: recv buffer size: 4194304

With 2M send buffer:

% ./qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock ~/bench/data-10g.img
before: send buffer size: 8192
before: recv buffer size: 8192
after: send buffer size: 2097152
after: recv buffer size: 8388608
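For reference, the same before/after check can be reproduced outside
qemu-nbd with a small standalone probe. This is just a sketch, using
socketpair() instead of the accepted NBD socket, with the same 1M/4M
sizes as the patch; it is not part of the patch itself:

/*
 * Standalone probe (illustrative only): read the default AF_UNIX
 * buffer sizes, try to enlarge them, and read them back to see
 * whether the kernel honored, clamped, or ignored the request.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

static void report(int fd, const char *when)
{
    int value;
    socklen_t value_size = sizeof(value);

    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, &value_size);
    fprintf(stderr, "%s: send buffer size: %d\n", when, value);
    value_size = sizeof(value);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &value, &value_size);
    fprintf(stderr, "%s: recv buffer size: %d\n", when, value);
}

int main(void)
{
    const int sndbuf_size = 1024 * 1024;
    const int rcvbuf_size = 4 * sndbuf_size;
    int fds[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        perror("socketpair");
        return EXIT_FAILURE;
    }

    report(fds[0], "before");

    /* Unlike the patch, check the return values so a silent clamp
     * can be told apart from a hard failure. */
    if (setsockopt(fds[0], SOL_SOCKET, SO_SNDBUF,
                   &sndbuf_size, sizeof(sndbuf_size)) < 0) {
        perror("setsockopt(SO_SNDBUF)");
    }
    if (setsockopt(fds[0], SOL_SOCKET, SO_RCVBUF,
                   &rcvbuf_size, sizeof(rcvbuf_size)) < 0) {
        perror("setsockopt(SO_RCVBUF)");
    }

    report(fds[0], "after");
    return EXIT_SUCCESS;
}

On macOS this should print the same 8192 defaults seen above. On
Linux, getsockopt() should report twice the requested SO_SNDBUF (the
kernel doubles the value for bookkeeping overhead) clamped by
wmem_max, matching what 'man 7 socket' describes.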