On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
On 11/11/17 6:07 AM, Daniel Borkmann wrote:
On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
From: Vlad Dumitrescu <vla...@google.com>
Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
Signed-off-by: Vlad Dumitrescu <vla...@google.com>
---
include/uapi/linux/bpf.h | 1 +
net/core/filter.c | 11 +++++++++++
tools/include/uapi/linux/bpf.h | 1 +
3 files changed, 13 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e880ae6434ee..9757a2002513 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -947,6 +947,7 @@ struct bpf_sock_ops {
__u32 local_ip6[4]; /* Stored in network byte order */
__u32 remote_port; /* Stored in network byte order */
__u32 local_port; /* stored in host byte order */
+ __u32 priority;
};
/* List of known BPF sock_ops operators.
diff --git a/net/core/filter.c b/net/core/filter.c
index 61c791f9f628..a6329642d047 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
bpf_access_type type,
*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
offsetof(struct sock_common, skc_num));
break;
+
+ case offsetof(struct bpf_sock_ops, priority):
+ BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
+
+ *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+ struct bpf_sock_ops_kern, sk),
+ si->dst_reg, si->src_reg,
+ offsetof(struct bpf_sock_ops_kern, sk));
+ *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
+ offsetof(struct sock, sk_priority));
+ break;
Hm, I don't think this would work, I actually think your initial patch
was ok.
bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
right
before accessing options on either socket or TCP level, and bail out
with error
otherwise; in such cases we'd read something else here and assume it's
sk_priority.
even if it's not fullsock, it will just read zero, no? what's a problem
with that?
In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
the program author will know that it's meaningless to read sk_priority,
so returning zero with minimal checks is fine.
While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
since the safety is not compromised.
Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
struct request_sock itself is only 232 byte long in total, and the struct
inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
that way, so it cannot be ignored and assumed zero.
If we don't care about error when !fullsock, then you could code the
sk_fullsock(sk) check in BPF itself above in the ctx conversion, and
set it to 0 manually when !fullsock. It might make it harder in the
future to change sk_fullsock() itself, but in any case sk_fullsock()
helper should get a comment in its function saying that when contents
are changed, also above BPF bits need to be adjusted to remain an
equivalent test.