On Fri, Aug 14, 2020 at 10:17 AM linmiaohe <linmia...@huawei.com> wrote:
>
> Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote:
> >On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmia...@huawei.com> wrote:
> >>
> >> The var extra_uref is introduced to pass the initial reference taken
> >> in sock_zerocopy_alloc to the first generated skb. But now we may fail
> >> to pass the initial reference with newly allocated UDP or RAW uarg
> >> when the skb is zcopied.
> >
> >extra_uref is true if there is no previous skb to append to or there is a
> >previous skb, but that does not have zerocopy data associated yet (because
> >the previous call(s) did not set MSG_ZEROCOPY).
> >
> >In other words, when first (allocating and) associating a zerocopy struct
> >with the skb.
>
> Many thanks for your explanation. The var extra_uref plays the role as you
> say. I just borrowed the description of var extra_uref from previous commit
> log here.
>
> >
> >> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
> >> +               /* Only ref on newly allocated uarg. */
> >> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM &&
> >> skb_zcopy(skb) != uarg))
> >> +                       extra_uref = true;
> >
> >SOCK_STREAM does not use __ip_append_data.
> >
> >This leaves as new branch skb_zcopy(skb) && skb_zcopy(skb) != uarg.
> >
> >This function can only acquire a uarg through sock_zerocopy_realloc, which
> >on skb_zcopy(skb) only returns the existing uarg or NULL (for not
> >SOCK_STREAM).
> >
> >So I don't see when that condition can happen.
> >
>
> On skb_zcopy(skb), we return the existing uarg iff (uarg->id + uarg->len ==
> atomic_read(&sk->sk_zckey)) in sock_zerocopy_realloc. So we may get a newly
> allocated
> uarg via sock_zerocopy_alloc(). Though we may not trigger this codepath now,
> it's still a potential problem that we may miss the right trace to uarg.

I don't think that can happen. The question is when this branch is false

        next = (u32)atomic_read(&sk->sk_zckey);
        if ((u32)(uarg->id + uarg->len) == next) {

I cannot come up with a case. I think it might be vestigial.

The goal is to ensure that only a consecutive range of notification
IDs is appended. Each notification ID corresponds to a sendmsg
invocation with MSG_ZEROCOPY. In both TCP and UDP with corking, data
is ordered, and reads and updates of these fields happen together as
a transaction:

                /* realloc only when socket is locked (TCP, UDP cork),
                 * so uarg->len and sk_zckey access is serialized
                 */
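
To make the invariant concrete, here is a rough userspace model of the
consecutive-ID check. It is only a sketch and not the kernel code: struct
zc_uarg, struct zc_sock, zc_alloc() and zc_try_extend() are hypothetical
stand-ins for the uarg, the socket's sk_zckey counter, and the alloc and
realloc paths, and it assumes the caller serializes access the way the
socket lock does, so no atomics are used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the zerocopy uarg: covers IDs [id, id + len). */
struct zc_uarg {
	uint32_t id;	/* first notification ID in the range */
	uint16_t len;	/* number of consecutive IDs in the range */
};

/* Hypothetical stand-in for the socket: only the next-ID counter. */
struct zc_sock {
	uint32_t zckey;	/* next notification ID to hand out */
};

/* Model of the alloc path: start a new range at the current key. */
void zc_alloc(struct zc_sock *sk, struct zc_uarg *uarg)
{
	uarg->id = sk->zckey++;
	uarg->len = 1;
}

/*
 * Model of the realloc path: extend the range by one ID only if the
 * range ends exactly at the socket's next key. Returns false when the
 * IDs would not be consecutive. Assumes the caller holds the lock, so
 * uarg->len and zckey are read and updated as one transaction.
 */
bool zc_try_extend(struct zc_sock *sk, struct zc_uarg *uarg)
{
	uint32_t next = sk->zckey;

	if ((uint32_t)(uarg->id + uarg->len) != next)
		return false;

	uarg->len++;
	sk->zckey = next + 1;
	return true;
}
```

In this model, as long as every zc_alloc() and zc_try_extend() call on a
socket goes through the same serialized path, the check in zc_try_extend()
never fails, which is the reason the false branch looks vestigial to me.
It only fails if something else advances zckey between calls.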