On (01/18/18 18:09), Willem de Bruijn wrote:
> If that is true in general for PF_RDS, then it is a reasonable approach.
> How about treating it as a (follow-on) optimization path. Opportunistic
> piggybacking of notifications on data reads is more widely applicable.

sounds good.

> > that's similar to what I have, except that it does not have the
> > MSG_PEEK part (you'd need to enforce that the data portion
> > is upper-bounded, and that the application has the responsibility
> > of sending down "enough" buffer with recvmsg).
>
> Right. I think that an upper bound is the simplest solution here.
>
> By the way, if you allocate an skb immediately on page pinning, then
> there are always sufficient skbs to store all notifications. On errqueue
> enqueue just drop the new skb and copy its notification to the body of
> the skb already on the queue, if one exists and it has room. That is
> essentially what the tcp zerocopy code does with the [data, info] range.

ok, I'll give that a shot (I'm working through the other review
comments as well)

fwiw, the data-corruption issue I mentioned turned out to be a day-one
bug in rds-tcp (patched in http://patchwork.ozlabs.org/patch/863183/).
The buffer reaping with zcopy (and aggressiveness of rds-stress)
brought this one out..

--Sowmini
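
For reference, a rough userspace sketch of the coalescing scheme quoted above: one notification buffer is allocated up front per send (standing in for the skb allocated at page-pinning time), and on enqueue the new buffer is dropped if its cookie fits into the buffer already at the tail of the queue. This is not kernel code. `struct notif`, `struct notif_queue`, `notif_alloc()`, `notif_enqueue()` and `MAX_COOKIES` are all hypothetical names chosen for illustration, and the fixed per-entry capacity is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_COOKIES 8              /* hypothetical per-entry capacity */

/* Hypothetical stand-in for an skb sitting on the socket error queue. */
struct notif {
	struct notif *next;
	size_t ncookies;
	uint32_t cookies[MAX_COOKIES];
};

/* Hypothetical stand-in for the error queue itself. */
struct notif_queue {
	struct notif *head;
	struct notif *tail;
	size_t len;                /* number of entries, not cookies */
};

/*
 * Allocate one entry per send, up front. Because every send already
 * owns an entry, the completion path never has to allocate.
 */
static struct notif *notif_alloc(uint32_t cookie)
{
	struct notif *n = calloc(1, sizeof(*n));

	if (n) {
		n->cookies[0] = cookie;
		n->ncookies = 1;
	}
	return n;
}

/*
 * Enqueue a completed send's entry. If the tail entry has room, copy
 * the cookie into it and drop the new entry; otherwise append the new
 * entry. Returns true when the entry was merged (and freed).
 */
static bool notif_enqueue(struct notif_queue *q, struct notif *n)
{
	struct notif *tail;

	assert(q != NULL && n != NULL && n->ncookies == 1);

	tail = q->tail;
	if (tail && tail->ncookies < MAX_COOKIES) {
		tail->cookies[tail->ncookies++] = n->cookies[0];
		free(n);
		return true;
	}

	n->next = NULL;
	if (tail)
		tail->next = n;
	else
		q->head = n;
	q->tail = n;
	q->len++;
	return false;
}
```

With this layout the queue grows by one entry per `MAX_COOKIES` completions rather than one per completion, and a reader that drains the head entry picks up a batch of cookies in a single pass.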