> > -static void __packet_set_status(struct packet_sock *po, void *frame, int 
> > status)
> > +static void __packet_set_status(struct packet_sock *po, void *frame, int 
> > status,
> > +                               bool call_complete)
> >  {
> >         union tpacket_uhdr h;
> >
> > @@ -381,6 +382,8 @@ static void __packet_set_status(struct packet_sock *po, 
> > void *frame, int status)
> >                 BUG();
> >         }
> >
> > +       if (po->wait_on_complete && call_complete)
> > +               complete(&po->skb_completion);
>
> This wake need not happen before the barrier. Only one caller of
> __packet_set_status passes call_complete (tpacket_destruct_skb).
> Moving this branch to the caller avoids a lot of code churn.
>
> Also, multiple packets may be released before the process is awoken.
> The process will block until packet_read_pending drops to zero. Can
> defer the wait_on_complete to that one instance.

Eh no. The point of having this sleep in the send loop is that
additional slots may be released for transmission (flipped to
TP_STATUS_SEND_REQUEST) from another thread while this thread is
waiting.

Else, it would have been much simpler to move the wait below the send
loop: send as many packets as possible, then wait for all of them
having been released. Much clearer control flow.

Where to set and clear the wait_on_complete boolean remains. Integer
assignment is fragile, as the compiler and processor may optimize or
move simple seemingly independent operations. As complete() takes a
spinlock, avoiding that in the DONTWAIT case is worthwhile. But probably
still preferable to set when beginning waiting and clear when calling
complete.

Reply via email to