On 06/29/2018 09:25 AM, Daniel Borkmann wrote: > On 06/19/2018 08:00 PM, Tushar Dave wrote: >> Add new eBPF prog type BPF_PROG_TYPE_SOCKET_SG_FILTER which uses the >> existing socket filter infrastructure for bpf program attach and load. >> SOCKET_SG_FILTER eBPF program receives struct scatterlist as bpf context >> contrast to SOCKET_FILTER which deals with struct skb. This is useful >> for kernel entities that don't have skb to represent packet data but >> want to run eBPF socket filter on packet data that is in form of struct >> scatterlist e.g. IB/RDMA >> >> Signed-off-by: Tushar Dave <tushar.n.d...@oracle.com> >> Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com> [...] >> static void __bpf_prog_release(struct bpf_prog *prog) >> { >> - if (prog->type == BPF_PROG_TYPE_SOCKET_FILTER) { >> + if (prog->type == BPF_PROG_TYPE_SOCKET_FILTER || >> + prog->type == BPF_PROG_TYPE_SOCKET_SG_FILTER) { >> bpf_prog_put(prog); >> } else { >> bpf_release_orig_filter(prog); >> @@ -1551,10 +1552,16 @@ int sk_reuseport_attach_filter(struct sock_fprog >> *fprog, struct sock *sk) >> >> static struct bpf_prog *__get_bpf(u32 ufd, struct sock *sk) >> { >> + struct bpf_prog *prog; >> + >> if (sock_flag(sk, SOCK_FILTER_LOCKED)) >> return ERR_PTR(-EPERM); >> >> - return bpf_prog_get_type(ufd, BPF_PROG_TYPE_SOCKET_FILTER); >> + prog = bpf_prog_get_type(ufd, BPF_PROG_TYPE_SOCKET_FILTER); >> + if (IS_ERR(prog)) >> + prog = bpf_prog_get_type(ufd, BPF_PROG_TYPE_SOCKET_SG_FILTER); >> + >> + return prog; >> } > > Hmm, I don't think this works: this now means as unpriviledged I can attach a > new > BPF_PROG_TYPE_SOCKET_SG_FILTER to a non-rds socket e.g. normal tcp/udp > through the > SO_ATTACH_BPF sockopt, where input context is skb instead of sg list and thus > crash > my box?
... probably best to just make a setsockopt specific to rds here, so the two are fully separated. Also worth exploring whether you can reuse as much as possible from the struct sk_msg_buff context and in general the BPF_PROG_TYPE_SK_MSG type that is using this which we already have in sockmap today. At least feels like some of the concepts are a bit similar. For pulling in more payload you have bpf_msg_pull_data() there which I think might be more user-friendly at least in that you have the full payload from start to the 'current' end available and don't need to navigate through individual sg entries back/forth which could perhaps end up being bit painful for users, though I can see that it's a middle ground between some skb_load_bytes()-alike helper that would copy the pieces out of the sg entries vs needing to linearize. What are the requirements here, would it make sense to offer both as an option or is this impractical based on what you've measured?