Currently, the destination register is marked as unknown for a 32-bit sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is SCALAR_VALUE.
This is too conservative: some valid cases are rejected. In particular, it can turn a constant scalar value into an unknown value, causing some runtime optimizations to fail. For example, test_l4lb_noinline.c compiled with -mattr=+alu32 contains the following insn pattern:

return from callee:
 411: (b4) (u32) r7 = (u32) 0
 412: (54) (u32) r7 &= (u32) 1
 413: (bc) (u32) r0 = (u32) r7
 414: (95) exit

to caller:
 202: (bc) (u32) r1 = (u32) r0
 203: (b4) (u32) r0 = (u32) 2
 204: (bf) r7 = r9
 205: (15) if r1 == 0x0 goto pc+69
 206: (79) r9 = *(u64 *)(r10 -80)
 207: (71) r1 = *(u8 *)(r9 +16)

The ending sequence in the callee is a sub-register move insn which should make r0 a SCALAR_VALUE with constant value 0. However, the verifier conservatively marks r0 as unknown, so r0 is no longer a constant; instead it is:

  R0=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))

After the return to the caller, r0 is copied into r1, and r1 is then compared against constant 0. In the C code logic, the stack slot at (r10 - 80) is only stored with a valid pointer when r1 is non-zero, in which case insn 207 is a valid load; otherwise the stack slot is never initialized and insn 207 is invalid. The code pattern above belongs to the path that initializes r1 to zero, so this path does not initialize the stack slot. The verifier could skip analyzing the fall-through path starting at insn 206, because with r1 known to be zero the comparison at insn 205 is always true. But since r0 was conservatively marked as unknown in the callee's ending sequence in the first place, the information about r0/r1 has been lost. As a consequence, the verifier fails to skip the fall-through path and dereferences the stack slot at (r10 - 80), which does not hold a valid spilled pointer on this path.

This patch relaxes the marking of the sub-register move destination.
For a SCALAR_VALUE, or a pointer value that is allowed to leak as a scalar value, it is safe to simply copy the value from the source register, force the value type to SCALAR_VALUE and then truncate it to 32 bits.

A unit test is also included to demonstrate this issue; it fails before this patch.

As this relaxation potentially gives the verifier more accurate information, BPF C programs of reasonable size have been benchmarked, including those inside the kernel BPF selftests and those inside the Cilium repo. There is NO regression in the number of processed instructions, either with or without -mattr=+alu32, and there are some improvements:

 - test_xdp_noinline now passes verification under -mattr=+alu32.
 - a few Cilium bpf programs got their processed insn number reduced.

Insn processed before/after this patch:

                       default      -mattr=+alu32
Kernel selftest
===
test_xdp.o             371/371      369/369
test_l4lb.o            6345/6345    5623/5623
test_xdp_noinline.o    2971/2971    rejected/2727
test_tcp_estates.o     429/429      430/430

Cilium bpf
===
bpf_lb-DLB_L3.o        2110/2110    1730/1733
bpf_lb-DLB_L4.o        2251/2251    2037/2029
bpf_lb-DUNKNOWN.o      701/701      729/622
bpf_lxc.o              96023/95033  N/A
bpf_netdev.o           7456/7245    N/A
bpf_overlay.o          3096/2898    3085/2947

bpf_lxc.o and bpf_netdev.o compiled with -mattr=+alu32 are rejected by the verifier because of another issue inside the verifier's support for alu32 binaries. One Cilium bpf program can generate several processed-insn counts; the numbers above are their sum.
Reviewed-by: Jakub Kicinski <jakub.kicin...@netronome.com>
Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/verifier.c                       | 32 ++++++++++++++---------------
 tools/testing/selftests/bpf/test_verifier.c | 13 ++++++++++++
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9584438..ce8a1c3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3583,23 +3583,23 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			return err;
 
 		if (BPF_SRC(insn->code) == BPF_X) {
-			if (BPF_CLASS(insn->code) == BPF_ALU64) {
-				/* case: R1 = R2
-				 * copy register state to dest reg
-				 */
-				regs[insn->dst_reg] = regs[insn->src_reg];
-				regs[insn->dst_reg].live |= REG_LIVE_WRITTEN;
-			} else {
-				/* R1 = (u32) R2 */
-				if (is_pointer_value(env, insn->src_reg)) {
-					verbose(env,
-						"R%d partial copy of pointer\n",
-						insn->src_reg);
-					return -EACCES;
-				}
-				mark_reg_unknown(env, regs, insn->dst_reg);
-				coerce_reg_to_size(&regs[insn->dst_reg], 4);
+			u8 dst_reg, src_reg = insn->src_reg;
+
+			/* Reject partial pointer copy on R1 = (u32) R2. */
+			if (BPF_CLASS(insn->code) == BPF_ALU &&
+			    is_pointer_value(env, src_reg)) {
+				verbose(env, "R%d partial copy of pointer\n",
+					src_reg);
+				return -EACCES;
+			}
+			dst_reg = insn->dst_reg;
+			regs[dst_reg] = regs[src_reg];
+			if (BPF_CLASS(insn->code) == BPF_ALU) {
+				/* Update type and range info. */
+				regs[dst_reg].type = SCALAR_VALUE;
+				coerce_reg_to_size(&regs[dst_reg], 4);
 			}
+			regs[dst_reg].live |= REG_LIVE_WRITTEN;
 		} else {
 			/* case: R = imm
 			 * remember the value we stored into this reg
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 17021d2..18d0b7f 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -2936,6 +2936,19 @@ static struct bpf_test tests[] = {
 		.result = ACCEPT,
 	},
 	{
+		"alu32: mov u32 const",
+		.insns = {
+		BPF_MOV32_IMM(BPF_REG_7, 0),
+		BPF_ALU32_IMM(BPF_AND, BPF_REG_7, 1),
+		BPF_MOV32_REG(BPF_REG_0, BPF_REG_7),
+		BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0),
+		BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.retval = 0,
+	},
+	{
 		"unpriv: partial copy of pointer",
 		.insns = {
 		BPF_MOV32_REG(BPF_REG_1, BPF_REG_10),
-- 
2.7.4