On Mon, 26 Apr 2021 13:14:48 +0200 Ralph Schmieder <ralph.schmie...@gmail.com> wrote:

> > On Apr 23, 2021, at 18:39, Stefano Brivio <sbri...@redhat.com>
> > wrote:
> >
> > [...]
> >
> > Okay, so it doesn't seem to fit your case, but this specific point
> > is where you actually have a small advantage using a stream-oriented
> > socket. If you receive a packet and have a smaller receive buffer,
> > you can read the length of the packet from the vnet header and then
> > read the rest of the packet at a later time.
> >
> > With a datagram-oriented socket, you would need to know the maximum
> > packet size in advance, and use a receive buffer that's large
> > enough to contain it, because if you don't, you'll discard data.
>
> For me, the maximum packet size is a jumbo frame (e.g. 9x1024) anyway
> -- everything must fit into an atomic write of that size.

Well, the day you want to do some batching... ;) but sure, I see your
point.

> > [...]
> >
> > On a side note, I wonder why you need two named sockets instead of
> > one -- I mean, they're bidirectional...
>
> Hmm... each peer needs to send unsolicited frames/packets to the
> other end... and thus needs to bind to their socket. Pretty much for
> the same reason as the UDP transport requires you to specify a local
> and a remote 5-tuple. Even though for AF_INET, the local port does
> not have to be specified, the OS would assign an ephemeral port to
> make it unique. Am I missing something?

I see your point now.
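
To make the "one named socket" idea concrete, here is a minimal sketch
(Python just for brevity, not the code under review) of two peers
exchanging unsolicited data over a single named AF_UNIX stream socket.
The function name and the "my_path" socket name are made up for the
illustration, and it assumes a platform with AF_UNIX/SOCK_STREAM
support:

```python
import os
import socket
import tempfile


def one_named_socket_roundtrip():
    """Send data both ways over a single named AF_UNIX stream socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my_path")  # hypothetical socket name

        # Only the listening peer binds to a name in the filesystem.
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(1)

        # The other peer only connects: it never calls bind().
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        server, _ = listener.accept()

        with listener, client, server:
            # Both ends send before either one has received anything.
            server.sendall(b"frame from listener")
            client.sendall(b"frame from connector")
            got_by_client = client.recv(64)
            got_by_server = server.recv(64)
        return got_by_client, got_by_server
```

Calling one_named_socket_roundtrip() should return the two payloads,
each one received by the opposite peer. This only covers the stream
case: with a datagram socket, the peer that doesn't bind explicitly
still needs some address if it wants to receive replies.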

Well, I think it's different from the AF_INET case due to the way
AF_UNIX works: UNIX domain sockets don't necessarily need to make the
endpoint known or visible, see a more detailed explanation at:

  https://comp.unix.admin.narkive.com/AhAOKP1s/lsof-find-both-endpoints-of-a-unix-socket

Even though, nowadays on Linux:

$ nc -luU my_path & (sleep 1; nc.openbsd -uU my_path & lsof +E -aUc nc)
[1] 373285
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        373285 sbrivio    3u  unix 0x000000004076431a      0t0 3957568 my_path type=DGRAM ->INO=3956394 373288,nc.openbs,4u
nc.openbs 373288 sbrivio    4u  unix 0x00000000f5b2e2e1      0t0 3956394 /tmp/nc.XXXXC0whUu type=DGRAM ->INO=3957568 373285,nc,3u

for datagram sockets, the endpoint is exported, and lsof can report
the endpoint for "my_path" here (-luU binds to a UNIX domain datagram
socket, -uU connects to it).

With a stream socket, by the way:

$ nc -lU my_path & (sleep 1; nc.openbsd -U my_path & lsof +E -aUc nc)
[1] 375445
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        375445 sbrivio    3u  unix 0x0000000053abf57c      0t0 3969787 my_path type=STREAM
nc        375445 sbrivio    4u  unix 0x000000001960c1ef      0t0 3969788 my_path type=STREAM ->INO=3970624 375448,nc.openbs,3u
nc.openbs 375448 sbrivio    3u  unix 0x000000000538fa63      0t0 3970624 type=STREAM ->INO=3969788 375445,nc,4u

so I think it should be optional. Even with datagram sockets, just like
the example above (I'm not suggesting that you do this, it's just
another possible choice), only one peer needs to bind to a named
socket, and yet they can exchange data.

> Another thing: on Windows, there's a AF_UNIX/SOCK_STREAM
> implementation... So, technically it should be possible to use that
> code path on Windows, too. Not a windows guy, though... So, can't
> say whether it would simply work or not:
>
> https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/

Thanks for the pointer.
I can't test this, so I wouldn't remove that #ifndef, but perhaps I
could add a link to this, in case somebody needs it and stumbles upon
this code path.

--
Stefano
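
As a footnote to the receive-buffer point near the top of this mail,
here is a rough sketch (again Python, for brevity) of the difference
between the two socket types. It is not the actual framing used by any
transport discussed here: the 4-byte big-endian length prefix is an
assumption standing in for whatever header carries the packet length,
and the helper names are made up:

```python
import socket
import struct


def recv_exact(sock, n, chunk=1024):
    """Read exactly n bytes from a stream socket, at most chunk at a time."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(min(chunk, n - len(buf)))
        if not part:
            raise ConnectionError("peer closed the connection")
        buf += part
    return bytes(buf)


def stream_demo(payload):
    """Stream socket: read the length first, then the rest in small pieces."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        # Hypothetical framing: 4-byte length prefix, then the packet.
        a.sendall(struct.pack("!I", len(payload)) + payload)
        (length,) = struct.unpack("!I", recv_exact(b, 4))
        return recv_exact(b, length)


def dgram_demo(payload, bufsize=1024):
    """Datagram socket: whatever doesn't fit in the buffer is discarded."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        a.send(payload)
        a.send(b"next")
        first = b.recv(bufsize)   # truncated to bufsize bytes
        second = b.recv(bufsize)  # the next datagram, not the leftover
        return first, second
```

With a 9x1024 byte payload, stream_demo() should hand back the whole
packet even though it never reads more than 1024 bytes at once, while
dgram_demo() gets only the first 1024 bytes of it, followed by the next
datagram.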