Re: [PATCH bpf-next 07/11] bpf: Add helper to retrieve socket in BPF

Alexei Starovoitov Thu, 13 Sep 2018 15:23:33 -0700

On Thu, Sep 13, 2018 at 02:24:03PM -0700, Joe Stringer wrote:
> On Thu, 13 Sep 2018 at 14:22, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> >
> > On Thu, Sep 13, 2018 at 02:17:17PM -0700, Joe Stringer wrote:
> > > On Thu, 13 Sep 2018 at 14:02, Alexei Starovoitov
> > > <alexei.starovoi...@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 13, 2018 at 01:55:01PM -0700, Joe Stringer wrote:
> > > > > On Thu, 13 Sep 2018 at 12:06, Alexei Starovoitov
> > > > > <alexei.starovoi...@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 5:06 PM, Alexei Starovoitov
> > > > > > <alexei.starovoi...@gmail.com> wrote:
> > > > > > > On Tue, Sep 11, 2018 at 05:36:36PM -0700, Joe Stringer wrote:
> > > > > > > >> This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and
> > > > > > > >> bpf_sk_lookup_udp(), which allow BPF programs to find out if there is a
> > > > > > > >> socket listening on this host, and return a socket pointer which the
> > > > > > > >> BPF program can then access to determine, for instance, whether to
> > > > > > > >> forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the
> > > > > > > >> socket, so when a BPF program makes use of this function, it must
> > > > > > > >> subsequently pass the returned pointer into the newly added sk_release()
> > > > > > > >> to return the reference.
> > > > > > >>
> > > > > > > >> By way of example, the following pseudocode would filter inbound
> > > > > > > >> connections at XDP if there is no corresponding service listening for
> > > > > > > >> the traffic:
> > > > > > >>
> > > > > > >>   struct bpf_sock_tuple tuple;
> > > > > > >>   struct bpf_sock_ops *sk;
> > > > > > >>
> > > > > > > >>   populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
> > > > > > >>   sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0);
> > > > > > > ...
> > > > > > >> +struct bpf_sock_tuple {
> > > > > > >> +     union {
> > > > > > >> +             __be32 ipv6[4];
> > > > > > >> +             __be32 ipv4;
> > > > > > >> +     } saddr;
> > > > > > >> +     union {
> > > > > > >> +             __be32 ipv6[4];
> > > > > > >> +             __be32 ipv4;
> > > > > > >> +     } daddr;
> > > > > > >> +     __be16 sport;
> > > > > > >> +     __be16 dport;
> > > > > > >> +     __u8 family;
> > > > > > >> +};
> > > > > > >
> > > > > > > > since we can pass ptr_to_packet into map lookup and other helpers now,
> > > > > > > > can you move 'family' out of bpf_sock_tuple and combine with netns_id arg?
> > > > > > > > then progs wouldn't need to copy bytes from the packet into tuple
> > > > > > > > to do a lookup.
> > > > >
> > > > > If I follow, you're proposing that users should be able to pass a
> > > > > pointer to the source address field of the L3 header, and assuming
> > > > > that the L3 header ends with saddr+daddr (no options/extheaders), and
> > > > > is immediately followed by the sport/dport then a packet pointer
> > > > > should work for performing socket lookup. Then it is up to the BPF
> > > > > program writer to ensure that this is the case, or otherwise fall back
> > > > > to populating a copy of the sock tuple on the stack.
> > > >
> > > > yep.
> > > >
> > > > > > have been thinking more about it.
> > > > > > > since only ipv4 and ipv6 are supported, maybe use the size of
> > > > > > > bpf_sock_tuple to infer the family inside the helper, so it doesn't
> > > > > > > need to be passed explicitly?
> > > > >
> > > > > Let me make sure I understand the proposal here.
> > > > >
> > > > > The current structure and function prototypes are:
> > > > >
> > > > > struct bpf_sock_tuple {
> > > > >       union {
> > > > >               __be32 ipv6[4];
> > > > >               __be32 ipv4;
> > > > >       } saddr;
> > > > >       union {
> > > > >               __be32 ipv6[4];
> > > > >               __be32 ipv4;
> > > > >       } daddr;
> > > > >       __be16 sport;
> > > > >       __be16 dport;
> > > > >       __u8 family;
> > > > > };
> > > > ...
> > > > > You're proposing something like:
> > > > >
> > > > > struct bpf_sock_tuple4 {
> > > > >       __be32 saddr;
> > > > >       __be32 daddr;
> > > > >       __be16 sport;
> > > > >       __be16 dport;
> > > > >       __u8 family;
> > > > > };
> > > > >
> > > > > struct bpf_sock_tuple6 {
> > > > >       __be32 saddr[4];
> > > > >       __be32 daddr[4];
> > > > >       __be16 sport;
> > > > >       __be16 dport;
> > > > >       __u8 family;
> > > > > };
> > > >
> > > > I think the split is unnecessary.
> > > > I'm proposing:
> > > > struct bpf_sock_tuple {
> > > >       union {
> > > >               __be32 ipv6[4];
> > > >               __be32 ipv4;
> > > >       } saddr;
> > > >       union {
> > > >               __be32 ipv6[4];
> > > >               __be32 ipv4;
> > > >       } daddr;
> > > >       __be16 sport;
> > > >       __be16 dport;
> > > > };
> > > >
> > > > that points directly into the packet (when ipv4 options are not there)
> > > > and bpf_sk_lookup_tcp() uses 'size' argument to figure out ipv4/ipv6 family.
> > >
> > > Needs to be subtly different, the 'sport'/'dport' offset would be
> > > wrong in the IPv4 case otherwise:
> >
> > ahh. right.
> >
> > >
> > > We could take my definitions above and do the following if we want to
> > > try to type the helper definition:
> > >
> > > union bpf_sock_tuple {
> > >        struct bpf_sock_tuple4 t4;
> > >        struct bpf_sock_tuple6 t6;
> > > };
> >
> > yes. sounds great to me. Much better than 'void *' in the helper.
> 
> Could even do something like this:
> 
> $ cat foo.c
> #include <linux/types.h>
> 
> struct bpf_sock_tuple {
>    union {
>    struct {
>        __be32 saddr;
>        __be32 daddr;
>        __be16 sport;
>        __be16 dport;
>    } ipv4;
>    struct {
>        __be32 saddr[4];
>        __be32 daddr[4];
>        __be16 sport;
>        __be16 dport;
>    } ipv6;
>    };
> };
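
To make the offset point concrete, here is a small userspace sketch, not
kernel code. It assumes plain uint32_t/uint16_t as stand-ins for
__be32/__be16 so it builds without kernel headers, and the name
sock_tuple_sketch is made up for illustration. With one struct per family
the ports sit directly after the addresses in both cases, and the two
sizes differ, so a 'size' argument should be enough to tell them apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace copy of the layout quoted above.
 * uint32_t/uint16_t stand in for __be32/__be16 (network byte order). */
struct sock_tuple_sketch {
	union {
		struct {
			uint32_t saddr;
			uint32_t daddr;
			uint16_t sport;
			uint16_t dport;
		} ipv4;
		struct {
			uint32_t saddr[4];
			uint32_t daddr[4];
			uint16_t sport;
			uint16_t dport;
		} ipv6;
	};
};

/* IPv4: 4 + 4 bytes of addresses, so the ports start at offset 8. */
static_assert(offsetof(struct sock_tuple_sketch, ipv4.sport) == 8,
	      "ipv4 sport follows saddr+daddr");
/* IPv6: 16 + 16 bytes of addresses, so the ports start at offset 32. */
static_assert(offsetof(struct sock_tuple_sketch, ipv6.sport) == 32,
	      "ipv6 sport follows saddr+daddr");
/* The per-family sizes differ, which is what size-based inference needs. */
static_assert(sizeof(((struct sock_tuple_sketch *)0)->ipv4) == 12,
	      "ipv4 tuple is 12 bytes");
static_assert(sizeof(((struct sock_tuple_sketch *)0)->ipv6) == 36,
	      "ipv6 tuple is 36 bytes");
```

In the earlier single-struct variant with 16-byte address unions, sport
would always sit at offset 32, which does not match an IPv4 packet.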


both solutions look fine.
I'd go with whichever one is cleaner looking from bpf prog pov.
Both probably need some casting of skb->data pointer.
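
A rough userspace sketch of the two pieces discussed here, under stated
assumptions: tuple4, tuple6, family_from_size() and ipv4_tuple_bytes() are
hypothetical names, not the helper's real API. The first function infers
the family from the tuple size; the second finds where the tuple starts in
an IPv4 header that carries no options. A BPF program would cast the
packet pointer after its usual bounds checks; this sketch only returns the
byte pointer and leaves the copy or cast to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-family tuples; fields are in network byte order. */
struct tuple4 { uint32_t saddr, daddr; uint16_t sport, dport; };
struct tuple6 { uint32_t saddr[4], daddr[4]; uint16_t sport, dport; };

enum tuple_family { FAMILY_UNKNOWN = 0, FAMILY_IPV4 = 4, FAMILY_IPV6 = 6 };

/* Infer the address family purely from the size the caller passed in. */
static enum tuple_family family_from_size(size_t size)
{
	if (size == sizeof(struct tuple4))
		return FAMILY_IPV4;
	if (size == sizeof(struct tuple6))
		return FAMILY_IPV6;
	return FAMILY_UNKNOWN;
}

/* Return the start of saddr/daddr/sport/dport inside an IPv4 packet, or
 * NULL when the header has options (IHL != 5) or the buffer is too short.
 * With IHL == 5 the header is 20 bytes, saddr is at offset 12, and the L4
 * ports follow immediately, so the 12 tuple bytes are contiguous. */
static const uint8_t *ipv4_tuple_bytes(const uint8_t *l3, size_t len)
{
	if (len < 20 + 4)
		return NULL;
	if ((l3[0] >> 4) != 4 || (l3[0] & 0x0f) != 5)
		return NULL;
	return l3 + 12;
}

/* Small self-check exercising both functions; returns 0 on success. */
static int tuple_sketch_demo(void)
{
	uint8_t pkt[24] = { 0x45 };	/* version 4, IHL 5, rest zeroed */

	assert(family_from_size(sizeof(struct tuple4)) == FAMILY_IPV4);
	assert(family_from_size(sizeof(struct tuple6)) == FAMILY_IPV6);
	assert(ipv4_tuple_bytes(pkt, sizeof(pkt)) == pkt + 12);

	pkt[0] = 0x46;			/* IHL 6: options present, fall back */
	assert(ipv4_tuple_bytes(pkt, sizeof(pkt)) == NULL);
	return 0;
}
```

When ipv4_tuple_bytes() returns NULL the program would fall back to
filling a tuple on the stack, as discussed earlier in the thread.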
