On 03/15/2018 01:32 PM, Daniel Borkmann wrote:
> On 03/12/2018 08:23 PM, John Fastabend wrote:
>> A single sendmsg or sendfile system call can contain multiple logical
>> messages that a BPF program may want to read and apply a verdict to.
>> But without an apply_bytes helper, any verdict on the data applies to
>> all bytes in the sendmsg/sendfile. Alternatively, a BPF program may
>> only care to read the first N bytes of a msg. If the payload is large,
>> say MBs or even GBs, setting up and calling the BPF program repeatedly
>> for all bytes, even though the verdict is already known, creates
>> unnecessary overhead.
>>
>> To allow BPF programs to control how many bytes a given verdict
>> applies to, we implement a bpf_msg_apply_bytes() helper. When called
>> from within a BPF program, this sets a counter, internal to the
>> BPF infrastructure, that applies the last verdict to the next N
>> bytes. If N is smaller than the current data being processed
>> from a sendmsg/sendfile call, the first N bytes will be sent and
>> the BPF program will be re-run with start_data pointing to byte
>> N+1. If N is larger than the current data being processed, the
>> BPF verdict will be applied to multiple sendmsg/sendfile calls
>> until N bytes are consumed.
>>
>> Note 1: if a socket closes with the apply_bytes counter non-zero,
>> this is not a problem, because data is not buffered for N bytes;
>> it is sent as it's received.
>>
>> Note 2: if this is operating in the sendpage context, the data
>> pointers may be zeroed after this call if the apply walks beyond
>> a data range specified by a msg_pull_data() call (a helper
>> implemented shortly in this series).
>>
>> Signed-off-by: John Fastabend <john.fastab...@gmail.com>
>> ---
>>  include/uapi/linux/bpf.h |    3 ++-
>>  net/core/filter.c        |   16 ++++++++++++++++
>>  2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index b8275f0..e50c61f 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -769,7 +769,8 @@ enum bpf_attach_type {
>>      FN(getsockopt),                 \
>>      FN(override_return),            \
>>      FN(sock_ops_cb_flags_set),      \
>> -    FN(msg_redirect_map),
>> +    FN(msg_redirect_map),           \
>> +    FN(msg_apply_bytes),
>>
>>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>>   * function eBPF program intends to call
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 314c311..df2a8f4 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -1928,6 +1928,20 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>      .arg4_type = ARG_ANYTHING,
>>  };
>>
>> +BPF_CALL_2(bpf_msg_apply_bytes, struct sk_msg_buff *, msg, u64, bytes)
>> +{
>> +    msg->apply_bytes = bytes;
>
> Here in bpf_msg_apply_bytes() but also in bpf_msg_cork_bytes() the
> signature is u64, but in struct sk_msg_buff and struct smap_psock it's
> type int, so a user-provided u64 will make these negative. Is there a
> reason to allow a negative value here rather than using u32 everywhere?
>
Nope, no reason for negative values; we can make it consistently u32.
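
For anyone following along, here is a minimal sketch of how an SK_MSG
program might use the helper, written against the u32 signature agreed
above. The section name, the libbpf-style includes, and the assumed
protocol (a 4-byte host-order length header per logical message) are
illustrative assumptions, not part of this series:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("sk_msg")
int msg_verdict(struct sk_msg_md *msg)
{
    void *data = msg->data;
    void *data_end = msg->data_end;
    __u32 len;

    /* Only the header needs inspecting; if fewer than 4 bytes are
     * currently accessible, just pass the data through.
     */
    if (data + sizeof(len) > data_end)
        return SK_PASS;

    /* Assumed protocol: each logical message starts with a 4-byte
     * host-order length field.
     */
    len = *(__u32 *)data;

    /* Apply this SK_PASS verdict to the whole logical message, so the
     * program is not re-run for every chunk of a large payload.
     */
    bpf_msg_apply_bytes(msg, sizeof(len) + len);

    return SK_PASS;
}

char _license[] SEC("license") = "GPL";

With something like this, a large sendfile carrying one logical message
triggers the program once per apply window instead of once per chunk,
which is exactly the overhead the commit message is describing.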