On (01/18/18 17:54), Willem de Bruijn wrote:
> > 2. If we have the option of passing completion-notification up as ancillary
> > data on the pollin/recvmsg channel itself (instead of MSG_ERRQUEUE)
>
> This assumes a somewhat symmetric workload, where there are enough recv
> calls to reap the notification associated with the send calls.

Your comment about the assumption is true, but at least for the database
use-cases, we have a request-response model, so the assumption works out.
I don't know whether many other workloads that send large buffers have
this pattern.

> I would stay with MSG_ERRQUEUE processing. One option is to pass data
> up to userspace in the data portion of the notification skb instead of
> encoding it in ancillary data, like tcp_get_timestamping_opt_stats.

That's similar to what I have, except that it does not have the MSG_PEEK
part (you'd need to enforce that the data portion is upper-bounded, and
that the application has the responsibility of sending down "enough"
buffer with recvmsg).

Note that any one of these choices is ok with me; I have no special
attachment to any of them.

--Sowmini
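
For readers following the thread, here is a minimal sketch of what the "data portion" option could look like from userspace: the application passes a bounded buffer to recvmsg() with MSG_ERRQUEUE and reads completion cookies from the message payload rather than from ancillary data. The payload layout (an array of uint32_t cookies), the MAX_COOKIES bound, and the helper name reap_completions() are assumptions made for illustration; none of them come from the thread or from an existing kernel ABI.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical upper bound on cookies per notification (illustration only). */
#define MAX_COOKIES 8

/*
 * Read one notification from the socket error queue and take the
 * completion cookies from its data portion. The uint32_t array layout
 * is an assumption, not an existing ABI.
 *
 * Returns the number of cookies read, 0 if the error queue is empty,
 * or -1 on error (including a caller buffer that was too small).
 */
static int reap_completions(int fd, uint32_t *cookies, size_t max)
{
    char control[128];
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    iov.iov_base = cookies;
    iov.iov_len = max * sizeof(*cookies);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    n = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
    if (n < 0) {
        /* Nothing queued yet: not an error for the caller. */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;
        return -1;
    }

    /* The application did not send down "enough" buffer. */
    if (msg.msg_flags & MSG_TRUNC)
        return -1;

    return (int)((size_t)n / sizeof(*cookies));
}
```

The MSG_TRUNC check is where the "enough buffer" responsibility shows up: with an upper-bounded data portion, a caller that always passes room for MAX_COOKIES entries would never hit it.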