Thanks Kenton, I missed that writeMessage() just passes along raw pointers to segments in a call to writev().
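
For reference, here is a minimal sketch of that gather-write pattern. It is written in Python for brevity and is not the library's C++ code. It assumes the stream framing described in the Cap'n Proto encoding docs (a little-endian segment count minus one, then one 32-bit size in words per segment, padded to an 8-byte boundary). The names `build_segment_table` and `write_segments` are hypothetical.

```python
import os
import struct

WORD = 8  # Cap'n Proto sizes are counted in 8-byte words


def build_segment_table(segments):
    """Build the stream-framing header for a list of segment buffers."""
    sizes = [len(s) // WORD for s in segments]
    table = struct.pack("<I", len(segments) - 1)        # segment count minus one
    table += struct.pack("<%dI" % len(sizes), *sizes)   # size of each segment, in words
    if len(table) % WORD:
        table += b"\x00" * 4                            # pad the table to a word boundary
    return table


def write_segments(fd, segments):
    """Hand the header and every segment to a single writev() call.

    The segments are passed as separate buffers, so they are never
    concatenated in userspace. A short write is possible on a real
    socket; this sketch does not retry.
    """
    buffers = [build_segment_table(segments)] + list(segments)
    return os.writev(fd, buffers)
```

The only buffer built here is the small segment table; the segment payloads go to the kernel from wherever they already live.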
If the application uses an IPC mechanism such as message queues instead of sockets, it cannot use the writeMessageToFd() interface. In that scenario, should the application fall back to application-level framing, sending the segments as separate messages and reconstructing the multi-segment message on the receiving end?

Thanks
-

On Saturday, July 20, 2019 at 12:27:01 PM UTC-7, Kenton Varda wrote:
>
> Hi Sune,
>
> On Sat, Jul 20, 2019 at 12:02 PM Sune Sash wrote:
>
>> Thanks to both of you.
>>
>> I see that the writeMessage() in serialize.h creates a segment table and
>> copies over the individual segments. So I presume, in order to send a
>> multi-segment without incurring a copy, the application
>> would have to forego using the interfaces in serialize.h and frame the
>> segments with a segment table on its own similar to what writeMessage()
>> does.
>
> No, writeMessage() does *not* make copies of the segments. It passes
> pointers to the original segment memory locations down into writev().
>
> The writev() call itself makes a copy of the data into kernel buffers, but
> no copies are made in userspace.
>
>> Is there a way for the application to send all segments as one individual
>> message over, say a socket, or would the application need to send
>> multiple messages and
>> reconstruct at the receiving end? I presume that as Kenton's response
>> indicates for a read, the write cannot be truly zero copy if the message
>> needs
>> to be ultimately sent over a socket/queue. The copy that is saved by not
>> using writeMessage would be coalescing of multiple segments into one long
>> segment. Is that an accurate understanding?
>
> If you want true end-to-end zero-copy -- even in the kernel -- then you
> need to map a shared memory segment into both the sending and receiving
> processes. In this case you would not use writeMessage(); you would use a
> MessageBuilder that allocates segments directly in the shared memory area.
>
> But, assuming you don't want to use shared memory and want to stick with
> sockets, then writeMessage() is optimal.
>
> -Kenton
>
>> Thanks
>> -
>>
>> On Saturday, July 20, 2019 at 11:40:22 AM UTC-7, Kenton Varda wrote:
>>>
>>> Ian is almost right. It's:
>>>
>>> 1. read() first 8 bytes, which contains the number of segments and size
>>> of the first segment.
>>> 2. (Only if more than 1 segment) read() the rest of the segment table.
>>> 3. read() the entire message content (all segments) into one big array.
>>>
>>> So in the case of a single-segment message, it's actually two syscalls.
>>>
>>> Of course, read() implies a copy -- from kernel buffers to userspace. So
>>> this is not truly zero-copy in that sense. However, once the data is read
>>> in from the kernel, it can then be operated on with no further copies.
>>>
>>> For true zero-copy, you need to use mmap() (for files) or shared memory
>>> (for inter-process communication).
>>>
>>> Over a normal IP network, zero-copy input is probably impossible,
>>> because the individual packets need to land in a temporary buffer in order
>>> for the kernel to be able to inspect their headers and find out which
>>> socket they are destined for. There's typically no way for the network card
>>> to deliver TCP packets directly to the final buffer. If you have high-end
>>> RDMA network hardware, that might be a different story.
>>>
>>> -Kenton
>>>
>>> On Sat, Jul 20, 2019 at 11:28 AM Ian Denhardt wrote:
>>>
>>>> Haven't looked at the code for the C++ implementation, but based on my
>>>> knowledge of the wire format[1] I would assume:
>>>>
>>>> 1. read() 4 bytes to get the number of segments
>>>> 2. read() the list of segment sizes
>>>> 3. readv() to read in all the segments
>>>>
>>>> [1]: https://capnproto.org/encoding.html#serialization-over-a-stream
>>>>
>>>> Quoting Sune Sash (2019-07-20 13:43:43)
>>>> > Hello
>>>> > I am new to cap'n'proto and came across this comment in serialize.h.
>>>> > "A multi-segment message can be read entirely in three system calls
>>>> > with no buffering."
>>>> > What are the 3 system calls involved? Also, I would like to understand
>>>> > if this statement is true under zero-copy semantics.
>>>> > Thanks
>>>> > Shweta
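
The read sequence Kenton describes in the quoted thread (first 8 bytes, then the rest of the segment table if there is more than one segment, then all segment content into one buffer) can be sketched roughly as follows. This is an illustrative Python sketch, not the C++ implementation. `read_exact` and `read_message` are hypothetical names, and the sketch assumes the stream framing from the encoding docs, where the segment table is padded to an 8-byte boundary.

```python
import os
import struct

WORD = 8  # Cap'n Proto sizes are counted in 8-byte words


def read_exact(fd, n):
    """Read exactly n bytes, looping because read() may return fewer."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("stream ended in the middle of a message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(fd):
    """Return the message's segments as slices of one contiguous buffer."""
    # Step 1: the first 8 bytes hold (segment count - 1) and the first segment's size.
    extra, first_size = struct.unpack("<II", read_exact(fd, 8))
    sizes = [first_size]

    # Step 2: only for multi-segment messages, read the rest of the segment table.
    if extra:
        rest_len = 4 * extra
        if rest_len % WORD:
            rest_len += 4  # skip the padding that aligns the table to a word
        rest = read_exact(fd, rest_len)
        sizes += struct.unpack_from("<%dI" % extra, rest)

    # Step 3: read all segment content into one buffer, then slice without copying.
    body = memoryview(read_exact(fd, WORD * sum(sizes)))
    segments, offset = [], 0
    for size in sizes:
        segments.append(body[offset:offset + WORD * size])
        offset += WORD * size
    return segments
```

When every read() returns the full amount requested, this issues three read() calls for a multi-segment message and two for a single-segment one, matching the counts in the thread. Short reads on a real socket would add more.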
--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/5ce13b79-6a1c-4e6d-b8ab-ecee7e664a79%40googlegroups.com.
