On 02/05, Stanislav Fomichev wrote:
> On 02/05, Alexei Starovoitov wrote:
> > On Tue, Feb 05, 2019 at 07:56:19PM -0800, Stanislav Fomichev wrote:
> > > On 02/05, Alexei Starovoitov wrote:
> > > > On Tue, Feb 05, 2019 at 04:59:31PM -0800, Stanislav Fomichev wrote:
> > > > > On 02/05, Alexei Starovoitov wrote:
> > > > > > On Tue, Feb 05, 2019 at 12:40:03PM -0800, Stanislav Fomichev wrote:
> > > > > > > On 02/05, Willem de Bruijn wrote:
> > > > > > > > On Tue, Feb 5, 2019 at 12:57 PM Stanislav Fomichev 
> > > > > > > > <s...@google.com> wrote:
> > > > > > > > >
> > > > > > > > > Currently, when eth_get_headlen calls flow dissector, it doesn't pass any
> > > > > > > > > skb. Because we use passed skb to lookup associated networking namespace
> > > > > > > > > to find whether we have a BPF program attached or not, we always use
> > > > > > > > > C-based flow dissector in this case.
> > > > > > > > >
> > > > > > > > > The goal of this patch series is to add new networking namespace argument
> > > > > > > > > to the eth_get_headlen and make BPF flow dissector programs be able to
> > > > > > > > > work in the skb-less case.
> > > > > > > > >
> > > > > > > > > The series goes like this:
> > > > > > > > > 1. introduce __init_skb and __init_skb_shinfo; those will be used to
> > > > > > > > >    initialize temporary skb
> > > > > > > > > 2. introduce skb_net which can be used to get networking namespace
> > > > > > > > >    associated with an skb
> > > > > > > > > 3. add new optional network namespace argument to __skb_flow_dissect and
> > > > > > > > >    plumb through the callers
> > > > > > > > > 4. add new __flow_bpf_dissect which constructs temporary on-stack skb
> > > > > > > > >    (using __init_skb) and calls BPF flow dissector program
> > > > > > > > 
> > > > > > > > The main concern I see with this series is this cost of skb zeroing
> > > > > > > > for every packet in the device driver receive routine, *independent*
> > > > > > > > from the real skb allocation and zeroing which will likely happen
> > > > > > > > later.
> > > > > > > Yes, plus ~200 bytes on the stack for the callers.
> > > > > > > 
> > > > > > > Not sure how visible this zeroing is, though; I can probably try to
> > > > > > > get some numbers from BPF_PROG_TEST_RUN (running the current version
> > > > > > > vs. running with an on-stack skb).
> > > > > > 
> > > > > > imo an extra 256-byte memset for every packet is a non-starter.
> > > > > We can put pre-allocated/initialized skbs without data into percpu or even
> > > > > use pcpu_freelist_pop/pcpu_freelist_push to make sure we don't have to think
> > > > > about having multiple percpu for irq/softirq/process contexts.
> > > > > Any concerns with that approach?
> > > > > Any other possible concerns with the overall series?
> > > > 
> > > > I'm missing why the whole thing is needed.
> > > > You're saying:
> > > > "make BPF flow dissector programs be able to work in the skb-less case".
> > > > What does it mean specifically?
> > > > The only non-skb case is XDP.
> > > > Are you saying you want flow_dissector prog to be run in XDP?
> > > eth_get_headlen, which drivers call on the RX path on a chunk of data to
> > > guesstimate the length of the headers, calls flow dissector without an skb
> > > (__skb_flow_dissect is a weird interface that accepts either an skb or
> > > data+len). Right now, there is no way to trigger BPF flow dissector
> > > for this case (we don't have an skb to get the associated namespace, etc.).
> > > The patch series tries to fix that to make sure that we always trigger
> > > BPF program if it's attached to a device's namespace.
> > 
> > then why not create a flow_dissector prog type that works without skb?
> > Why do you need to fake an skb?
> > XDP progs work just fine without it.
> What's the advantage of having another prog type? In this case we would have
> to write the same flow dissector program twice: first time against __sk_buff
> interface, second time against xdp_md.
> By using fake skb, we make the same flow dissector __sk_buff BPF program
> work in both contexts without a rewrite to an xdp interface (I don't
> think users should care whether flow dissector was called from "xdp" vs skb
> context; and we're sort of stuck with __sk_buff interface already).
Should I follow up with v2 where I address memset(,,256) for each packet?
Or do you still have some questions/doubts/suggestions regarding the problem
I'm trying to solve?
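
To make the memset question concrete, here is a rough userspace sketch of the
two options (zero a temporary object per packet vs. reuse a pre-initialized
one). It is only an illustration under stated assumptions: struct fake_skb,
dissect_with_stack_skb, dissect_with_cached_skb and toy_prog are hypothetical
names and are not the kernel's struct sk_buff or anything from the patches,
and a thread-local variable stands in for per-CPU storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a temporary skb; NOT the kernel's struct sk_buff. */
struct fake_skb {
	const unsigned char *data;	/* start of the packet bytes */
	size_t len;			/* number of valid bytes at data */
	unsigned char rest[240];	/* padding standing in for all other fields */
};

/* Hypothetical program type: returns the header length it found. */
typedef size_t (*dissect_prog_t)(const struct fake_skb *skb);

/* Option 1: build the object on the stack and zero all of it for every packet. */
static size_t dissect_with_stack_skb(const unsigned char *data, size_t len,
				     dissect_prog_t prog)
{
	struct fake_skb skb;

	assert(prog != NULL);
	memset(&skb, 0, sizeof(skb));	/* the per-packet cost being debated */
	skb.data = data;
	skb.len = len;
	return prog(&skb);
}

/*
 * Option 2: keep one object that starts out zeroed (static storage) and only
 * touch the fields that change. Thread-local storage is an assumption standing
 * in for per-CPU data; nesting across contexts is ignored here.
 */
static _Thread_local struct fake_skb cached_skb;

static size_t dissect_with_cached_skb(const unsigned char *data, size_t len,
				      dissect_prog_t prog)
{
	size_t ret;

	assert(prog != NULL);
	cached_skb.data = data;
	cached_skb.len = len;
	ret = prog(&cached_skb);
	/* Restore the pristine state so the next caller sees a clean object. */
	cached_skb.data = NULL;
	cached_skb.len = 0;
	return ret;
}

/* Toy "program": reports an Ethernet header (14 bytes) if the frame is long enough. */
static size_t toy_prog(const struct fake_skb *skb)
{
	return skb->len >= 14 ? 14 : 0;
}
```

Both helpers give the same answer for the same input; the second one skips the
full memset but would need per-context copies (or something like a freelist)
to be safe when irq/softirq/process contexts can nest on one CPU.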
