On Mon, Dec 21, 2020 at 05:52:08PM -0500, Willem de Bruijn wrote: > > > > - marking the skb data as inaccessible so skb_condense() > > > > and skb_zeroocopy_clone() leave it alone. > > > > > > Yep. Skipping content access on the Rx path will be interesting. I > > > wonder if that should be a separate opaque skb feature, independent > > > from whether the data is owned by userspace, peripheral memory, the > > > page cache or anything else. > > > > Would that be indicated by a bit on the skb (like pfmemalloc), or > > a bit in the skb_shared structure, as I'm leaning towards doing here? > > I would guide it in part by avoiding cold cacheline accesses. That > might be hard if using skb_shinfo. OTOH, you don't have to worry about > copying the bit during clone operations. > > > > > > If anything, eating up the last 8 bits in skb_shared_info should be > > > > > last resort. > > > > > > > > I would like to add 2 more bits in the future, which is why I > > > > moved them. Is there a compelling reason to leave the bits alone? > > > > > > Opportunity cost. > > > > > > We cannot grow skb_shared_info due to colocation with MTU sized linear > > > skbuff's in half a page. > > > > > > It took me quite some effort to free up a few bytes in commit > > > 4d276eb6a478 ("net: remove deprecated syststamp timestamp"). > > > > > > If we are very frugal, we could shadow some bits to have different > > > meaning in different paths. SKBTX_IN_PROGRESS is transmit only, I > > > think. But otherwise we'll have to just dedicate the byte to more > > > flags. Yours are likely not to be the last anyway. > > > > The zerocopy/enable flags could be encoded in one of the lower 3 bits > > in the destructor_arg, (similar to nouarg) but that seems messy. > > Agreed :) > > Let's just expand the flags for now. It may be better to have one > general purpose 16 bit flags bitmap, rather than reserving 8 bits > specifically to zerocopy features.
I was considering doing that also, but that would need to rearrange the flags in skb_shared_info. Then I realized that there are currently only TX flags and ZC flags, so went with that. I have no objections to doing it either way. My motivation here is when MSG_ZCTAP is added to tcp_sendmsg_locked(), it the returned uarg is self-contained for the rest of the function. -- Jonathan