On Tue, Aug 22, 2017 at 10:50:41AM +0800, Jason Wang wrote:
> > Perhaps the descriptor pool should also be
> > revised to allow out of order completions. Then there is no need to
> > copy zerocopy packets whenever they may experience delay.
> 
> Yes, but as replied in the referenced thread, windows driver may treat out
> of order completion as a bug.

That would be a windows driver bug then, but I don't think it makes this
assumption. What the referenced thread
(https://patchwork.kernel.org/patch/3787671/) is saying is that host
must use any buffers made available on a tx vq within a reasonable
timeframe otherwise windows guests panic.

Ideally we would detect that a packet is actually experiencing delay and
trigger the copy at that point e.g. by calling skb_linearize. But it
isn't easy to track these packets though and even harder to do a data
copy without races.

Which reminds me that skb_linearize in net core seems to be
fundamentally racy - I suspect that if skb is cloned, and someone is
trying to use the shared frags while another thread calls skb_linearize,
we get some use after free bugs which likely mostly go undetected
because the corrupted packets mostly go on wire and get dropped
by checksum code.

-- 
MST

Reply via email to