On Wed, Dec 19, 2018 at 05:44:07PM -0500, Jiong Wang wrote:
> Current eBPF ISA has 32-bit sub-register and has defined a set of ALU32
> instructions.
>
> However, there is no JMP32 instructions, the consequence is code-gen for
> 32-bit sub-registers is not efficient. For example, explicit sign-extension
> from 32-bit to 64-bit is needed for signed comparison.
>
> Adding JMP32 instruction therefore could complete eBPF ISA on 32-bit
> sub-register support. This also match those JMP32 instructions in most JIT
> backends, for example x64-64 and AArch64. These new eBPF JMP32 instructions
> could have one-to-one map on them.
>
> A few verifier ALU32 related bugs has been fixed recently, and JMP32
> introduced by this set further improves BPF sub-register ecosystem. Once
> this is landed, BPF programs using 32-bit sub-register ISA could get
> reasonably good support from verifier and JIT compilers. Users then could
> compare the runtime efficiency of one BPF program under both modes, and
> could use the one benchmarked as better. One good thing is JMP32 is making
> 32-bit JIT more efficient, because it only has 32-bit use, no def, so
> unlike ALU32, no need to clear high bits. Hence, even without data-flow
> analysis, JMP32 is making better code-gen then JMP64. More benchmark
> results are listed below in this cover letter.
>
> - Encoding
>
>   Ideally, JMP32 could use new CLASS BPF_JMP32, just like BPF_ALU and
>   BPF_ALU32. But we only has one class number 0x06 unused. I am not sure
>   if we want to keep it for other extension purpose. For example restore
>   it as BPF_MISC which could then redefine the interpretation of all the
>   remaining bits in bis[7:1];
>
>   So, I am following the coding style used by BPF_PSEUDO_CALL, that is to
>   use reserved bits under BPF_JMP. When BPF_SRC(code) == BPF_X, the
>   encoding is 0x1 at insn->imm. When BPF_SRC(code) == BPF_K, the encoding
>   is 0x1 at insn->src_reg. All other bits in imm and src_reg are still
>   reserved and should be zeroed.

This choice of encoding penalizes the interpreter a lot, since every jmp
(both 64-bit and 32-bit) becomes multiple conditional branches.
I suspect interpreter performance will suffer a lot.
We could still use such an encoding for uapi and recode to 256 opcodes,
but why jump through hoops when class 6 is still unused?
Just use it for BPF_JMP32.
It will also help avoid issues with JITs that rely on the opcode to do
the conversion. If we don't convert all JITs, then with the proposed
encoding the unconverted JITs will generate a 64-bit jmp for a 32-bit
one, since they don't check insn->imm or src_reg.
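
To make the two concerns above concrete, here is a minimal sketch in C. It is not kernel code: `struct insn` is a simplified stand-in for `struct bpf_insn`, `BPF_JMP32 = 0x06` is only the class value suggested above, and `is_jmp32_reserved_bits()`, `is_jmp32_class()` and `legacy_jit_cmp_width()` are hypothetical helpers made up for illustration. It shows that the reserved-bit encoding needs an extra data-dependent check on every jump, and that a decoder looking only at the opcode byte cannot tell the two widths apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct bpf_insn. The real structure packs
 * dst_reg and src_reg into 4-bit fields; that does not matter here. */
struct insn {
	uint8_t code;     /* opcode byte: class | op | source */
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_SRC(code)   ((code) & 0x08)

#define BPF_JMP   0x05
#define BPF_JMP32 0x06  /* assumption: the unused class 6 suggested above */
#define BPF_K     0x00
#define BPF_X     0x08
#define BPF_JGT   0x20

/* Proposed encoding: the 32-bit marker lives in a reserved field, so the
 * decoder has to branch on the source kind and then inspect imm or
 * src_reg before it knows the compare width. */
static bool is_jmp32_reserved_bits(const struct insn *i)
{
	if (BPF_CLASS(i->code) != BPF_JMP)
		return false;
	return BPF_SRC(i->code) == BPF_X ? i->imm == 1 : i->src_reg == 1;
}

/* Alternative: the width is carried by the opcode byte itself, so a
 * table indexed by opcode can dispatch without any extra branch. */
static bool is_jmp32_class(const struct insn *i)
{
	return BPF_CLASS(i->code) == BPF_JMP32;
}

/* Hypothetical unconverted JIT that decodes from the opcode byte only.
 * Returns the compare width in bits, or -1 for an opcode it does not know. */
static int legacy_jit_cmp_width(const struct insn *i)
{
	switch (BPF_CLASS(i->code)) {
	case BPF_JMP:
		return 64;
	default:
		return -1;
	}
}
```

With a 32-bit `JGT` against an immediate encoded the proposed way (`code = BPF_JMP | BPF_JGT | BPF_K`, `src_reg = 1`), `legacy_jit_cmp_width()` returns 64, i.e. the instruction is silently treated as a 64-bit compare. Encoded as `BPF_JMP32 | BPF_JGT | BPF_K`, the same helper returns -1, so an unconverted JIT would reject the instruction instead of miscompiling it.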