> On 18 Apr 2025, at 21:55, Eric Blake <ebl...@redhat.com> wrote:
> 
> On Fri, Apr 18, 2025 at 05:24:36PM +0300, Nir Soffer wrote:
>> Testing with qemu-nbd shows that computing a hash of an image via
>> qemu-nbd is 5-7 times faster with this change.
>> 
> 
>> +++ b/io/channel-socket.c
>> @@ -410,6 +410,19 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
>>     }
>> #endif /* WIN32 */
>> 
>> +#if __APPLE__
>> +    /* On macOS we need to tune the unix domain socket buffer for best
>> +     * performance. Apple recommends sizing the receive buffer at 4 times
>> +     * the size of the send buffer.
>> +     */
>> +    if (cioc->localAddr.ss_family == AF_UNIX) {
>> +        const int sndbuf_size = 1024 * 1024;
>> +        const int rcvbuf_size = 4 * sndbuf_size;
>> +        setsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(sndbuf_size));
>> +        setsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size));
>> +    }
>> +#endif /* __APPLE__ */
> 
> Why does this have to be limited?  On linux, 'man 7 unix' documents
> that SO_SNDBUF is honored (SO_RCVBUF is silently ignored but accepted
> for compatibility).  On the other hand, 'man 7 socket' states that it
> defaults to the value in /proc/sys/net/core/wmem_default (212992 on my
> machine) and cannot exceed the value in /proc/sys/net/core/wmem_max
> without CAP_NET_ADMIN privileges (also 212992 on my machine).
> 
> Of course, Linux and MacOS are different kernels, so your effort to
> set it to 1M may actually be working on Apple rather than being
> silently cut back to the enforced maximum.

Testing with values up to a 2m send buffer and an 8m receive buffer shows that
changing the values changes the performance, so they are not silently clipped.

> And the fact that raising
> it at all makes a difference merely says that unlike Linux (where the
> default appears to already be as large as possible), Apple is set up
> to default to a smaller buffer (more fragmentation requires more
> time), and bumping to the larger value improves performance.  But can
> you use getsockopt() prior to your setsockopt() to see what value
> Apple was defaulting to, and then again afterwards to see whether it
> actually got as large as you suggested?

Sure, tested with:

diff --git a/io/channel-socket.c b/io/channel-socket.c
index b858659764..9600a076be 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -418,8 +418,21 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
     if (cioc->localAddr.ss_family == AF_UNIX) {
         const int sndbuf_size = 1024 * 1024;
         const int rcvbuf_size = 4 * sndbuf_size;
+        int value;
+        socklen_t value_size = sizeof(value);
+
+        getsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &value, &value_size);
+        fprintf(stderr, "before: send buffer size: %d\n", value);
+        getsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &value, &value_size);
+        fprintf(stderr, "before: recv buffer size: %d\n", value);
+
         setsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &sndbuf_size, sizeof(sndbuf_size));
         setsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size));
+
+        getsockopt(cioc->fd, SOL_SOCKET, SO_SNDBUF, &value, &value_size);
+        fprintf(stderr, "after: send buffer size: %d\n", value);
+        getsockopt(cioc->fd, SOL_SOCKET, SO_RCVBUF, &value, &value_size);
+        fprintf(stderr, "after: recv buffer size: %d\n", value);
     }
 #endif /* __APPLE__ */
 
With 1m send buffer:

% ./qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock ~/bench/data-10g.img
before: send buffer size: 8192
before: recv buffer size: 8192
after: send buffer size: 1048576
after: recv buffer size: 4194304

With 2m send buffer:

% ./qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock ~/bench/data-10g.img
before: send buffer size: 8192
before: recv buffer size: 8192
after: send buffer size: 2097152
after: recv buffer size: 8388608
