On Wed, May 9, 2018 at 3:36 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>
> On 05/09/2018 12:21 PM, Willem de Bruijn wrote:
>
>> Indeed. The skb shared info struct is zeroed by dev_validate_header
>> as a result of dev->hard_header_len exceeding skb->end - skb->data.
>>
>> Not exactly sure yet how this can happen. The hard header length space
>> is accounted for during allocation as reserved memory. But,
>> packet_alloc_skb does call skb_reserve(), moving skb->data
>> effectively beyond this reserved region.
>>
>> It may be incorrect to pass skb->data to dev_validate_header, as that
>> does not point to the start of the ll_header anymore. Still figuring out what
>> the right fix is..
>>
>
> I believe the bug happens if the sock_wmalloc() call at line 1921 has to
> sleep.
>
> device can change (or at lest dev->hard_header_len can change)
>
> So we need to bailout if reserved/hhlen had changed.
>
> Or revert some patches, since dev_hold() and dev_put() are no longer high
> cost,
> since it is now using per cpu counter.

Oh nice, another bug :/ That seems quite plausible.

This reproducer does not modify hard_header_len, however. It sends a
long array of zero-byte requests with sendmmsg to eventually exceed
so_rcvbuf of the error queue. Hard header length is 116 throughout.
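
To make the arithmetic concrete, here is a rough userspace model of the
overrun described in the quoted text. It is not kernel code: struct
model_skb and pad_overrun() are hypothetical names, and the offsets in
the note below are made up. It only assumes that the header is
zero-padded from skb->data + len up to skb->data + hard_header_len, and
counts how many of those bytes land at or past skb->end, where the
shared info lives.

```c
#include <stddef.h>

/* Hypothetical model of the relevant skb offsets, relative to skb->head. */
struct model_skb {
	size_t data;	/* offset of skb->data (after skb_reserve) */
	size_t end;	/* offset of skb->end (shared info starts here) */
};

/*
 * Number of bytes that zero-padding the link-layer header would write
 * at or beyond skb->end. Padding is assumed to cover
 * [data + len, data + hard_header_len) and to happen only when
 * len < hard_header_len.
 */
static size_t pad_overrun(const struct model_skb *skb, size_t len,
			  size_t hard_header_len)
{
	size_t pad_start, pad_end;

	if (len >= hard_header_len)
		return 0;	/* header already complete, nothing is padded */

	pad_start = skb->data + len;
	pad_end = skb->data + hard_header_len;

	if (pad_end <= skb->end)
		return 0;	/* padding stays inside the buffer */

	/* Count only the part of the padding that lies at or past end. */
	return pad_end - (pad_start > skb->end ? pad_start : skb->end);
}
```

With a zero-byte request (len 0), hard_header_len 116 and, say, data at
offset 128 and end at offset 192, this gives 52 bytes written past
skb->end. If data still pointed at offset 0, the same padding would fit.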