On Wed, Mar 20, 2019 at 3:48 PM Stanislav Fomichev <s...@fomichev.me> wrote:
>
> On 03/20, Willem de Bruijn wrote:
> > On Wed, Mar 20, 2019 at 3:19 PM Stanislav Fomichev <s...@fomichev.me> wrote:
> > >
> > > On 03/20, Willem de Bruijn wrote:
> > > > On Wed, Mar 20, 2019 at 3:02 PM Stanislav Fomichev <s...@fomichev.me> 
> > > > wrote:
> > > > >
> > > > > On 03/20, Willem de Bruijn wrote:
> > > > > > On Wed, Mar 20, 2019 at 12:57 PM Stanislav Fomichev 
> > > > > > <s...@fomichev.me> wrote:
> > > > > > >
> > > > > > > On 03/19, Willem de Bruijn wrote:
> > > > > > > > On Tue, Mar 19, 2019 at 6:21 PM Stanislav Fomichev 
> > > > > > > > <s...@google.com> wrote:
> > > > > > > > >
> > > > > > > > > Now that we have __flow_bpf_dissect which works on raw data 
> > > > > > > > > (by
> > > > > > > > > constructing temporary on-stack skb), use it when doing
> > > > > > > > > BPF_PROG_TEST_RUN for flow dissector.
> > > > > > > > >
> > > > > > > > > This should help us catch any possible bugs due to missing 
> > > > > > > > > shinfo on
> > > > > > > > > the per-cpu skb.
> > > > > > > > >
> > > > > > > > > Note that existing __skb_flow_bpf_dissect swallows L2 headers 
> > > > > > > > > and returns
> > > > > > > > > nhoff=0, we need to preserve the existing behavior.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Stanislav Fomichev <s...@google.com>
> > > > > > > > > ---
> > > > > > > > >  net/bpf/test_run.c | 48 
> > > > > > > > > ++++++++++++++--------------------------------
> > > > > > > > >  1 file changed, 14 insertions(+), 34 deletions(-)
> > > > > > > > >
> > > > > > > >
> > > > > > > > > @@ -300,9 +277,13 @@ int 
> > > > > > > > > bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > > > > > > > >         preempt_disable();
> > > > > > > > >         time_start = ktime_get_ns();
> > > > > > > > >         for (i = 0; i < repeat; i++) {
> > > > > > > > > -               retval = bpf_flow_dissect_skb(prog, skb,
> > > > > > > > > -                                             
> > > > > > > > > &flow_keys_dissector,
> > > > > > > > > -                                             &flow_keys);
> > > > > > > > > +               retval = bpf_flow_dissect(prog, data, 
> > > > > > > > > eth->h_proto, ETH_HLEN,
> > > > > > > > > +                                         size, 
> > > > > > > > > &flow_keys_dissector,
> > > > > > > > > +                                         &flow_keys);
> > > > > > > > > +               if (flow_keys.nhoff >= ETH_HLEN)
> > > > > > > > > +                       flow_keys.nhoff -= ETH_HLEN;
> > > > > > > > > +               if (flow_keys.thoff >= ETH_HLEN)
> > > > > > > > > +                       flow_keys.thoff -= ETH_HLEN;
> > > > > > > >
> > > > > > > > why are these conditional?
> > > > > > > Hm, I didn't want these to be negative, because bpf flow program 
> > > > > > > can set
> > > > > > > them to zero and clamp_flow_keys makes sure they are in a 
> > > > > > > "sensible"
> > > > > > > range. For this particular case, I think we need to amend
> > > > > > > clamp_flow_keys to make sure that flow_keys.nhoff is in the range 
> > > > > > > of
> > > > > > > initial_nhoff..hlen, not 0..hlen (and then we can drop these 
> > > > > > > checks).
> > > > > >
> > > > > > So, previously eth_type_trans would call with data at the network
> > > > > > header. Now it is called with data at the link layer. How would
> > > > > > __skb_flow_bpf_dissect "swallows L2 headers and returns nhoff=0"? 
> > > > > > That
> > > > > s/__skb_flow_bpf_dissect/eth_type_trans/, I'll clarify that in the 
> > > > > patch
> > > > > description.
> > > > >
> > > > > > sounds incorrect.
> > > > > Previously, for skb case, eth_type_trans would pull ETH_HLEN (L2) and
> > > > > after that we did skb_reset_network_header. So when later we 
> > > > > initialized
> > > > > flow keys (flow_keys->nhoff = skb_network_offset(skb)), that would
> > > > > yield nhoff == 0.
> > > > >
> > > > > For example, see:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > > > >
> > > > > Now, we explicitly call bpf_flow_dissect with nhoff=ETH_HLEN and have 
> > > > > to
> > > > > undo it, otherwise, it breaks those tests.
> > > > >
> > > > > We could do something like the following instead:
> > > > > retval = bpf_flow_dissect(prog, data + ETH_HLEN, eth->h_proto, 0,
> > > > >                           size, &flow_keys_dissector,
> > > > >                           &flow_keys);
> > > > >
> > > > > But I wanted to make sure nhoff != 0 works.
> > > >
> > > > Makes sense. Ensuring that nhoff lies within initial_nhoff..hlen
> > > > sounds correct to me. But this is a limitation of the test, so should
> > > > be in the test logic, not in the generic clamp code. Perhaps just fail
> > > > the test if returned nhoff < ETH_HLEN?
> > > I don't think it's only about the tests. BPF program can return
> > > nhoff/thoff out of range as well (if there was some bug in its logic,
> > > for example). We should not blindly trust whatever it returns, right?
> >
> > Definitely. That's why we clamp. I'm not sure that we have to restrict
> > the minimum offset to initial nhoff, however.
> Makes sense. TBH, only the tests currently care about nhoff that flow
> dissector returns. In the kernel we use only thoff from bpf flow dissector
> and ignore any modifications to the nhoff.
>
> Do you think there is a usecase for nhoff possibly going backwards?
> In other words, why not prohibit that from the beginning and set the
> expectations strait (i.e. nhoff only grows).

Fair point. That is how the non bpf flow dissector works. And if the
initial offset is always sensible, indeed I see no reasonable case
where the program would return a lower value. I was a bit concerned
about that precondition. In practice, all but one caller passes 0 and
data at network header, where this discussion is moot. And the
exception is eth_get_headlen which hardcodes ETH_HLEN. Given that, y
our original suggestion to adjust the clamp function SGTM.

Reply via email to