Re: [RFC,POC] iptables/nftables to epbf/xdp via common intermediate layer

Daniel Borkmann Tue, 06 Mar 2018 06:23:20 -0800

On 03/04/2018 08:40 PM, Florian Westphal wrote:
> These patches, which go on top of the 'bpfilter' RFC patches,
> demonstrate an nftables to ebpf translation (done in userspace).
> In order to not duplicate the ebpf code generation efforts, the rules
> 
> iptables -i lo -d 127.0.0.2 -j DROP
> and
> nft add rule ip filter input ip daddr 127.0.0.2 drop
> 
> are first translated to a common intermediate representation, and then
> to ebpf, which attaches resulting prog to the XDP hook.
> 
> The IMR representation is identical in both cases, so both
> rules result in the same ebpf program.
> 
> The IMR currently assumes that translation will always be to ebpf.
> As per previous discussion it doesn't consider other targets, so
> for instance IMR pseudo-registers map 1:1 to ebpf ones.
> 
> The IMR is also supposed to be generic enough to make it easy to convert
> 'frontend' formats (iptables rule blob, nftables netlink) to it, and
> also extend it to cover ip rule, ovs or any other inputs in the future
> without need for major changes to the IMR.
> 
> The IMR currently implements the following basic operations:
>  - Relational (equal, not equal)
>  - immediates (32 and 64bit constants)
>  - payload with relative addressing (mac, network, transport header)
>  - verdict (pass, drop, next rule)
> 
> It's still in an early stage, but I think it's good enough as
> a proof-of-concept.


Thanks a lot for working on this! Overall I like the PoC and the
underlying idea of it! I think the design of such an IMR would indeed
be the critical part, in that it needs to be flexible enough to cover
both front ends well without having to make compromises for either of
them. The same goes for optimization passes, e.g. when we know that
two successive rules match on TCP header bits, we can reuse the
register that loaded/parsed them earlier. Similarly for maps, where
the lookup value would need to propagate through the linked imr
objects. Do you see such optimizations, or propagation of state in
general, as a direct part of the IMR, or rather somewhat hidden in the
IMR layer when doing the IMR to BPF 'jit' phase?
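
To make the register reuse case a bit more concrete, here is a minimal
sketch of how a translator could remember which payload region each
pseudo-register currently holds and only emit a new load on a miss.
All names (imr_payload, imr_regcache, imr_get_reg) are hypothetical and
not taken from the posted patches, and the eviction policy is
deliberately naive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not from the posted patches. */
enum imr_base { IMR_BASE_LL, IMR_BASE_NET, IMR_BASE_TRANSPORT };

struct imr_payload {
	enum imr_base base;	/* header the offset is relative to */
	uint16_t offset;	/* byte offset within that header */
	uint16_t len;		/* number of bytes loaded */
};

#define IMR_NREGS 4

struct imr_regcache {
	struct imr_payload loaded[IMR_NREGS];
	bool valid[IMR_NREGS];
	unsigned int loads_emitted;	/* counts loads that would be emitted */
};

static bool imr_payload_eq(const struct imr_payload *a,
			   const struct imr_payload *b)
{
	return a->base == b->base && a->offset == b->offset &&
	       a->len == b->len;
}

/*
 * Return the pseudo-register that holds payload 'p'. If an earlier rule
 * already loaded the same bytes, reuse that register; otherwise take a
 * free register (or naively evict register 0) and count a new load.
 */
static int imr_get_reg(struct imr_regcache *c, const struct imr_payload *p)
{
	int i, free_reg = -1;

	for (i = 0; i < IMR_NREGS; i++) {
		if (c->valid[i] && imr_payload_eq(&c->loaded[i], p))
			return i;
		if (!c->valid[i] && free_reg < 0)
			free_reg = i;
	}
	if (free_reg < 0)
		free_reg = 0;

	c->loaded[free_reg] = *p;
	c->valid[free_reg] = true;
	c->loads_emitted++;
	return free_reg;
}
```

With something like this, two successive rules asking for the same TCP
header byte would get the same register back and only one load would be
counted. A real pass would additionally have to make sure the earlier
load is executed on every path that reaches the later rule.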

Which other parts do you think would be needed for the IMR aside
from the above basic operations? ALU ops, JMPs for the relational ops?
I think it would be good to have clear semantics in terms of what
it would eventually abstract away from raw BPF when sitting on top
of it. Potentially these could be details on packet access or
interaction with helpers or other BPF features, such that it's BPF
prog type independent at this stage. Another point: given the priv/cap
level of the uapi request, there could also be different BPF code gen
backends that implement against the IMR. E.g. when the request comes
out of a userns, it has feature constraints such as having to use
bpf_skb_{load,store}_bytes() helpers for packet access instead of
direct packet access, or not being able to use BPF to BPF calls, etc;
wdyt?
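
As a rough illustration of the backend idea, the sketch below picks a
set of code gen capabilities from the origin of the request. The struct
and function names (imr_request, imr_backend_caps, imr_select_backend)
are made up for this example, and the from_userns flag stands in for
whatever priv/cap check the uapi path would really perform.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum imr_pkt_access {
	IMR_PKT_DIRECT,	/* direct packet access */
	IMR_PKT_HELPER,	/* bpf_skb_{load,store}_bytes() style helpers */
};

struct imr_backend_caps {
	enum imr_pkt_access pkt_access;
	bool bpf2bpf_calls;	/* may the backend emit BPF to BPF calls? */
};

struct imr_request {
	bool from_userns;	/* assumption: result of a priv/cap check */
};

/*
 * Select the feature set a code gen backend may use for this request.
 * The IMR itself stays the same; only the backend constraints differ.
 */
static struct imr_backend_caps
imr_select_backend(const struct imr_request *req)
{
	struct imr_backend_caps caps;

	if (req->from_userns) {
		caps.pkt_access = IMR_PKT_HELPER;
		caps.bpf2bpf_calls = false;
	} else {
		caps.pkt_access = IMR_PKT_DIRECT;
		caps.bpf2bpf_calls = true;
	}
	return caps;
}
```

The point is only that the front ends and the IMR would not need to
know about these constraints; the backend consults the caps when it
lowers a payload access or decides how to structure the program.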

> Known differences between nftjit.ko and bpfilter.ko:
> nftjit.ko currently doesn't run transparently, but that's only
> because I wanted to focus on the IMR and get the POC out of the door.
> 
> It should be possible to get it transparent via the bpfilter.ko approach.
> 
> Next steps for the IMR could be addition of binary operations for
> prefixes ("-d 192.168.0.1/24"), it's also needed e.g. for tcp flag
> matching (-p tcp --syn in iptables) and so on.
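
Both cases reduce to an AND with a mask followed by the equality
compare the IMR already has. A minimal sketch of the semantics in plain
C, assuming host byte order addresses; the function and macro names are
hypothetical. For --syn the mask follows the iptables definition, which
is equivalent to "--tcp-flags SYN,RST,ACK,FIN SYN".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; addresses are in host byte order for simplicity. */
static uint32_t imr_prefix_mask(unsigned int len)
{
	if (len == 0)
		return 0;
	if (len > 32)
		len = 32;
	return 0xffffffffu << (32 - len);
}

/* "-d net/len": mask both sides, then compare for equality. */
static bool imr_prefix_match(uint32_t addr, uint32_t net, unsigned int len)
{
	uint32_t mask = imr_prefix_mask(len);

	return (addr & mask) == (net & mask);
}

/* TCP flag bits as laid out in the flags byte of the TCP header. */
#define IMR_TCP_FIN 0x01
#define IMR_TCP_SYN 0x02
#define IMR_TCP_RST 0x04
#define IMR_TCP_ACK 0x10

/* "--syn": of SYN, RST, ACK and FIN, only SYN may be set. */
static bool imr_tcp_syn_match(uint8_t flags)
{
	uint8_t mask = IMR_TCP_FIN | IMR_TCP_SYN | IMR_TCP_RST | IMR_TCP_ACK;

	return (flags & mask) == IMR_TCP_SYN;
}
```

So one bitwise AND operation with an immediate operand would likely be
enough to cover both front end features.
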
> 
> I'd also be interested in whether XDP is seen as an appropriate
> target hook.  AFAICS the XDP and the nftables ingress hook are similar
> enough to consider just (re)using the XDP hook to jit the nftables ingress
> hook.  The translator could check if the hook is unused, and return
> early if some other program is already attached.
> 
> Comments welcome, especially wrt. IMR concept and what might be
> next step(s) in moving forward.
> 
> The patches are also available via git at
> https://git.breakpoint.cc/cgit/fw/net-next.git/log/?h=bpfilter7 .

Thanks,
Daniel

Re: [RFC,POC] iptables/nftables to epbf/xdp via common intermediate layer
