On Tue, Mar 5, 2019 at 4:03 PM Arthur Kepner <arthur.kep...@riverbed.com> wrote:
>
> The attachment contains an UNTESTED patch (though a similar patch was tested
> with a 3.10 kernel).
>
> We've been chasing a bug where packet corruption is seen on a tap device. We
> have a PACKET_MMAP socket which is bound to a tap interface. When throughput
> goes above a threshold, we begin to see that packets received on the tap
> device are truncated, or otherwise corrupted. We found that when packets are
> enqueued to the tap device, they are fine, but by the time they are read,
> they can be corrupted.
>
> And we found that simply deferring the call to skb_orphan() (where the
> destructor, tpacket_destruct_skb() marks the frame as TP_STATUS_AVAILABLE)
> fixes the problem.
>
> Maybe there's a better fix, but this worked for us. Thoughts? (Please CC me
> on replies - I'm not subscribed.)

The skb_orphan() call invokes tpacket_destruct_skb(), which updates the entry
in the packet ring to TP_STATUS_AVAILABLE. From that point on, the sender is
free to reuse the frame while the skb may still be queued.

Thanks for the report and suggested fix. Delaying the call to skb_orphan()
narrows the race between release and read, but does not fully remove it.

As of commit 5cd8d46ea156 ("packet: copy user buffers before orphan or clone")
in 4.20 this should no longer be an issue. That commit reuses the msg_zerocopy
infrastructure for packet ring packets with shared memory as well, and creates
a private copy whenever these may be looped to a local destination that may
queue indefinitely, like tun.
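
To make the ordering problem concrete, here is a small user-space sketch in C.
It is only a toy model, not kernel code: struct ring_frame, struct toy_skb and
all function names are hypothetical stand-ins, and the "sender" simply
overwrites the frame as soon as it is marked available. It shows why releasing
the frame before the queued packet is read lets the data change underneath it,
and why taking a private copy before the release avoids that.

```c
#include <assert.h>
#include <string.h>

#define FRAME_SIZE 16

/* Toy stand-ins for the ring status values (hypothetical, not the
 * kernel's TP_STATUS_* constants). */
enum frame_status { FRAME_AVAILABLE = 0, FRAME_SEND_REQUEST = 1 };

/* One slot of the shared ring, as the sending application sees it. */
struct ring_frame {
    enum frame_status status;
    char data[FRAME_SIZE];
};

/* Toy packet: payload points either into the ring or at a private copy. */
struct toy_skb {
    const char *payload;
    char private_copy[FRAME_SIZE];
    struct ring_frame *owner;   /* frame to release on orphan, or NULL */
};

/* Models the destructor step: hand the frame back to the sender. */
static void toy_orphan(struct toy_skb *skb)
{
    if (skb->owner) {
        skb->owner->status = FRAME_AVAILABLE;
        skb->owner = NULL;
    }
}

/* Models "copy user buffers before orphan": detach from the ring first. */
static void toy_copy_then_orphan(struct toy_skb *skb)
{
    memcpy(skb->private_copy, skb->payload, FRAME_SIZE);
    skb->payload = skb->private_copy;
    toy_orphan(skb);
}

/* The sender refills a frame only when it is marked available. */
static int sender_reuse(struct ring_frame *f, const char *msg)
{
    if (f->status != FRAME_AVAILABLE)
        return 0;
    memset(f->data, 0, FRAME_SIZE);
    strncpy(f->data, msg, FRAME_SIZE - 1);
    f->status = FRAME_SEND_REQUEST;
    return 1;
}

/* Returns 1 if the queued packet still reads "packet-1" after the
 * sender has had a chance to reuse the frame, 0 otherwise. */
static int queued_packet_intact(int copy_first)
{
    struct ring_frame frame = { FRAME_AVAILABLE, {0} };
    struct toy_skb skb = {0};
    int ok = sender_reuse(&frame, "packet-1");

    assert(ok);
    (void)ok;
    skb.payload = frame.data;   /* packet shares memory with the ring */
    skb.owner = &frame;

    if (copy_first)
        toy_copy_then_orphan(&skb);
    else
        toy_orphan(&skb);

    /* The packet is still "queued"; the sender is free to run now. */
    sender_reuse(&frame, "packet-2");

    return strcmp(skb.payload, "packet-1") == 0;
}
```

Calling queued_packet_intact(0) models the orphan-without-copy path and
queued_packet_intact(1) models copying first. In the real kernel the window
depends on timing and queue depth, which this deterministic sketch does not
try to reproduce.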