On Sun, Jun 23, 2019 at 7:40 AM Neil Horman <nhor...@tuxdriver.com> wrote: > > On Sat, Jun 22, 2019 at 10:21:31PM -0400, Willem de Bruijn wrote: > > > > -static void __packet_set_status(struct packet_sock *po, void *frame, > > > > int status) > > > > +static void __packet_set_status(struct packet_sock *po, void *frame, > > > > int status, > > > > + bool call_complete) > > > > { > > > > union tpacket_uhdr h; > > > > > > > > @@ -381,6 +382,8 @@ static void __packet_set_status(struct packet_sock > > > > *po, void *frame, int status) > > > > BUG(); > > > > } > > > > > > > > + if (po->wait_on_complete && call_complete) > > > > + complete(&po->skb_completion); > > > > > > This wake need not happen before the barrier. Only one caller of > > > __packet_set_status passes call_complete (tpacket_destruct_skb). > > > Moving this branch to the caller avoids a lot of code churn. > > > > > > Also, multiple packets may be released before the process is awoken. > > > The process will block until packet_read_pending drops to zero. Can > > > defer the wait_on_complete to that one instance. > > > > Eh no. The point of having this sleep in the send loop is that > > additional slots may be released for transmission (flipped to > > TP_STATUS_SEND_REQUEST) from another thread while this thread is > > waiting. > > > Thats incorrect. The entirety of tpacket_snd is protected by a mutex. No > other > thread can alter the state of the frames in the vector from the kernel send > path > while this thread is waiting.
I meant another user thread updating the memory mapped ring contents. > > Else, it would have been much simpler to move the wait below the send > > loop: send as many packets as possible, then wait for all of them > > having been released. Much clearer control flow. > > > Thats (almost) what happens now. The only difference is that with this > implementation, the waiting thread has the opportunity to see if userspace has > queued more frames for transmission during the wait period. We could > potentially change that, but thats outside the scope of this fix. Agreed. I think the current, more complex, behavior was intentional. We could still restructure to move it out of the loop and jump back. But, yes, definitely out of scope for a fix. > > Where to set and clear the wait_on_complete boolean remains. Integer > > assignment is fragile, as the compiler and processor may optimize or > > move simple seemingly independent operations. As complete() takes a > > spinlock, avoiding that in the DONTWAIT case is worthwhile. But probably > > still preferable to set when beginning waiting and clear when calling > > complete. > We avoid any call to wait_for_complete or complete already, based on the > gating > of the need_wait variable in tpacket_snd. If the transmitting thread doesn't > set MSG_DONTWAIT in the flags of the msg structure, we will never set > wait_for_complete, and so we will never manipulate the completion queue. But we don't know the state of this at tpacket_destruct_skb time without wait_for_completion?