On 11/2/18 8:42 AM, Edward Cree wrote:
> On 02/11/18 15:02, Arnaldo Carvalho de Melo wrote:
>> Yeah, didn't work as well:
>
>> And the -vv in 'perf trace' didn't seem to map to further details in the
>> output of the verifier debug:
> Yeah for log_level 2 you probably need to make source-level changes to either
> perf or libbpf (I think the latter).  It's annoying that essentially no tools
> plumb through an option for that, someone should fix them ;-)
>
>> libbpf: -- BEGIN DUMP LOG ---
>> libbpf:
>> 0: (bf) r6 = r1
>> 1: (bf) r1 = r10
>> 2: (07) r1 += -328
>> 3: (b7) r7 = 64
>> 4: (b7) r2 = 64
>> 5: (bf) r3 = r6
>> 6: (85) call bpf_probe_read#4
>> 7: (79) r1 = *(u64 *)(r10 -320)
>> 8: (15) if r1 == 0x101 goto pc+4
>>  R0=inv(id=0) R1=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
>> 9: (55) if r1 != 0x2 goto pc+22
>>  R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
>> 10: (bf) r1 = r6
>> 11: (07) r1 += 16
>> 12: (05) goto pc+2
>> 15: (79) r3 = *(u64 *)(r1 +0)
>> dereference of modified ctx ptr R1 off=16 disallowed
> Aha, we at least got a different error message this time.
> And indeed llvm has done that optimisation, rather than the more obvious
>    11: r3 = *(u64 *)(r1 +16)
> because it wants to have lots of reads share a single insn.  You may be able
> to defeat that optimisation by adding compiler barriers, idk.  Maybe someone
> with llvm knowledge can figure out how to stop it (ideally, llvm would know
> when it's generating for bpf backend and not do that).  -O0?  ¯\_(ツ)_/¯
The optimization looks roughly like this:

  br1:
    ...
    r1 += 16
    goto merge
  br2:
    ...
    r1 += 20
    goto merge
  merge:
    *(u64 *)(r1 + 0)

The compiler tries to merge common loads. There is no easy way to stop
this particular optimization without turning off a lot of other
optimizations as well. The easiest workaround is to add a compiler
barrier, __asm__ __volatile__("" : : : "memory"), right after each ctx
memory access to prevent the downstream merging (a minimal sketch is
appended at the bottom of this mail).

> Alternatively, your prog looks short enough that maybe you could kick the C
> habit and write it directly in eBPF asm, that way no-one is optimising things
> behind your back.  (I realise this option won't appeal to everyone ;-)

LLVM supports BPF inline assembly as well. There are some examples here:
  https://github.com/llvm-mirror/llvm/blob/master/test/CodeGen/BPF/inline_asm.ll
You may try it for selective ctx accesses to work around some compiler
optimizations. I personally have not used it yet, so I am not sure
whether it actually works or not :-)

> The reason the verifier disallows this, iirc, is because it needs to be able
> to rewrite the offsets on ctx accesses (see convert_ctx_accesses()) in case the
> underlying kernel struct doesn't match the layout of the ctx ABI.  To do this
> it needs the ctx offset to live entirely in the insn doing the access,
> otherwise different paths could lead to the same insn accessing different ctx
> offsets with different fixups needed — can't be done.
>
> -Ed
>
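
Here is the minimal sketch of the barrier workaround mentioned above.
This is not the real perf/augmented-syscalls program; the struct layout,
field names and the handler are made up purely for illustration. The
intent is that each branch keeps its own ctx load with its own offset,
and the empty asm right after the access discourages llvm from sharing a
single load insn between the branches:

  /* Hypothetical ctx layout, for illustration only -- not a real
   * tracepoint ABI.
   */
  struct my_ctx {
          long id;
          long args[6];
  };

  long handler(struct my_ctx *ctx)
  {
          long val = 0;

          if (ctx->id == 0x101) {
                  val = ctx->args[0];
                  /* compiler barrier right after the ctx access */
                  __asm__ __volatile__("" : : : "memory");
          } else if (ctx->id == 2) {
                  val = ctx->args[1];
                  /* same here, so the two loads are not merged into a
                   * single insn reachable from both branches
                   */
                  __asm__ __volatile__("" : : : "memory");
          }

          return val;
  }

Built with something like clang -O2 -target bpf -c, each load should
then keep its offset encoded in the load instruction itself, which is
what the verifier needs for ctx accesses.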
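
And a similarly untested sketch of the inline-asm variant, using the
constraint style from the inline_asm.ll test linked above; the offset 16
and the helper name are arbitrary, so treat it as a guess rather than a
recipe:

  /* Load a u64 at a fixed offset from ctx through BPF inline asm, so
   * the offset is spelled out in the load instruction itself and there
   * is nothing for llvm to merge across branches.  Untested sketch.
   */
  static inline __attribute__((always_inline))
  unsigned long ctx_load_u64_off16(void *ctx)
  {
          unsigned long val;

          asm volatile("%0 = *(u64 *)(%1 + 16)"
                       : "=r"(val)
                       : "r"(ctx));
          return val;
  }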