On 10/4/19 12:53 PM, Thomas Monjalon wrote:
04/10/2019 11:54, Steve Capper:
I'd recommend also reaching out to the BPF maintainers:
BPF JIT for ARM64
M: Daniel Borkmann <dan...@iogearbox.net>
M: Alexei Starovoitov <a...@kernel.org>
M: Zi Shen Lim <zlim....@gmail.com>
L: net...@vger.kernel.org
L: b...@vger.kernel.org
S: Supported
F: arch/arm64/net/
As they will have much better knowledge of the state of play and will be
better able to advise.
As far as I know Alexei and Daniel are OK with the idea.
But better to let them reply here.
I suggest we think about a way to package the kernel BPF JIT
for userspace usage (not only DPDK) as a library.
I don't understand why the DPDK JIT should be different
or optimized differently.
That would be great indeed as both projects would benefit from a shared
JIT instead of reimplementing everything twice. I never looked into DPDK
too much, but I presume the idea would also be to take the LLVM (or
bpf-gcc) generated object file and load it into a BPF 'engine' that sits
in user space on top of DPDK? Presumably the loader could be libbpf here as
well since it already knows how to parse the ELF, perform the relocations
etc. The only difference would be that you have a different context and
different helpers? Is that the goal eventually?
The only real issue I see is the need for dual BSD/GPL licensing.
This might be one avenue if all kernel JIT contributors would be on board.
Another option I'm wondering about could be to extend the bpf() syscall in order
to pass down a description of context and helper mappings e.g. via BTF and
let everything go through the verifier in the kernel the usual way (I presume
one goal might be that you want to ensure that the generated BPF code passes
the safety checks before running the prog), then have it JITed and extract
the generated image in order to use it from user space. The kernel would have
to make sure it never actually allows attaching this program in the kernel.
Generated opcodes can already be retrieved today (see below). Such infra
could potentially help bpf-gcc folks as well as they expressed a desire to
have some sort of simulator for their gcc BPF test suite, and it would
allow for consistent behavior of the BPF runtime. Just a thought.
# bpftool prog
2: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-10-03T12:53:11+0200 uid 0
xlated 296B jited 229B memlock 4096B map_ids 2,3
[...]
# bpftool prog dump xlated id 2
0: (bf) r6 = r1
1: (69) r7 = *(u16 *)(r6 +192)
2: (b4) w8 = 0
3: (55) if r7 != 0x8 goto pc+14
4: (bf) r1 = r6
5: (b4) w2 = 16
6: (bf) r3 = r10
7: (07) r3 += -4
8: (b4) w4 = 4
9: (85) call bpf_skb_load_bytes#7484768
10: (18) r1 = map[id:2]
12: (bf) r2 = r10
13: (07) r2 += -8
14: (62) *(u32 *)(r2 +0) = 32
15: (85) call trie_lookup_elem#90800
16: (15) if r0 == 0x0 goto pc+1
17: (44) w8 |= 2
18: (55) if r7 != 0xdd86 goto pc+14
19: (bf) r1 = r6
20: (b4) w2 = 24
21: (bf) r3 = r10
22: (07) r3 += -16
23: (b4) w4 = 16
24: (85) call bpf_skb_load_bytes#7484768
25: (18) r1 = map[id:3]
27: (bf) r2 = r10
28: (07) r2 += -20
29: (62) *(u32 *)(r2 +0) = 128
30: (85) call trie_lookup_elem#90800
31: (15) if r0 == 0x0 goto pc+1
32: (44) w8 |= 2
33: (b7) r0 = 1
34: (55) if r8 != 0x2 goto pc+1
35: (b7) r0 = 0
36: (95) exit
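For anyone reading the xlated dump above, the hex value in parentheses is
the BPF opcode byte. Below is a minimal sketch of how that byte splits into
class, operation and source (or size and mode for loads/stores), following
the eBPF instruction encoding. The names decode_opcode and the lookup tables
are made up for illustration and are not part of bpftool or libbpf; only the
modes and operations needed for this dump are guaranteed to be covered.

```python
# Illustrative decoder for the opcode byte printed by "bpftool prog dump xlated".
# Helper and table names are hypothetical; the bit layout is the eBPF encoding.

CLASSES = ["LD", "LDX", "ST", "STX", "ALU", "JMP", "JMP32", "ALU64"]
ALU_OPS = {0x00: "ADD", 0x10: "SUB", 0x20: "MUL", 0x30: "DIV", 0x40: "OR",
           0x50: "AND", 0x60: "LSH", 0x70: "RSH", 0x80: "NEG", 0x90: "MOD",
           0xa0: "XOR", 0xb0: "MOV", 0xc0: "ARSH", 0xd0: "END"}
JMP_OPS = {0x00: "JA", 0x10: "JEQ", 0x20: "JGT", 0x30: "JGE", 0x40: "JSET",
           0x50: "JNE", 0x60: "JSGT", 0x70: "JSGE", 0x80: "CALL", 0x90: "EXIT",
           0xa0: "JLT", 0xb0: "JLE", 0xc0: "JSLT", 0xd0: "JSLE"}
SIZES = {0x00: "W", 0x08: "H", 0x10: "B", 0x18: "DW"}
MODES = {0x00: "IMM", 0x20: "ABS", 0x40: "IND", 0x60: "MEM"}


def decode_opcode(op):
    """Split one opcode byte into (class, operation-or-size, source-or-mode)."""
    cls = CLASSES[op & 0x07]                    # low 3 bits: instruction class
    if cls in ("ALU", "ALU64"):
        return (cls, ALU_OPS[op & 0xf0], "X" if op & 0x08 else "K")
    if cls in ("JMP", "JMP32"):
        return (cls, JMP_OPS[op & 0xf0], "X" if op & 0x08 else "K")
    # Load/store classes: bits 3-4 are the size, bits 5-7 the addressing mode.
    return (cls, SIZES[op & 0x18], MODES.get(op & 0xe0, hex(op & 0xe0)))


# Every slot is 8 bytes. The dump is numbered 0..36, and the two
# "r1 = map[id:N]" loads (opcode 0x18, LD|DW|IMM) each occupy two slots,
# which is why the numbering skips 11 and 26.
INSN_SIZE = 8
xlated_bytes = (36 + 1) * INSN_SIZE             # matches "xlated 296B"

if __name__ == "__main__":
    for op in (0xbf, 0x69, 0xb4, 0x55, 0x85, 0x18, 0x62, 0x95):
        print(hex(op), decode_opcode(op))
    print("xlated bytes:", xlated_bytes)
```

So "(bf) r6 = r1" is a 64-bit register-to-register move, and "(18)" is the
wide 64-bit immediate load used for the map references.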
# bpftool prog dump jited id 2 opcodes
0: push %rbp
55
1: mov %rsp,%rbp
48 89 e5
4: sub $0x40,%rsp
48 81 ec 40 00 00 00
b: sub $0x28,%rbp
48 83 ed 28
f: mov %rbx,0x0(%rbp)
48 89 5d 00
13: mov %r13,0x8(%rbp)
4c 89 6d 08
17: mov %r14,0x10(%rbp)
4c 89 75 10
1b: mov %r15,0x18(%rbp)
4c 89 7d 18
1f: xor %eax,%eax
31 c0
21: mov %rax,0x20(%rbp)
48 89 45 20
25: mov %rdi,%rbx
48 89 fb
28: movzwq 0xc0(%rbx),%r13
4c 0f b7 ab c0 00 00 00
30: xor %r14d,%r14d
45 31 f6
33: cmp $0x8,%r13
49 83 fd 08
37: jne 0x0000000000000079
75 40
39: mov %rbx,%rdi
48 89 df
3c: mov $0x10,%esi
be 10 00 00 00
[...]
cb: jne 0x00000000000000cf
75 02
cd: xor %eax,%eax
31 c0
cf: mov 0x0(%rbp),%rbx
48 8b 5d 00
d3: mov 0x8(%rbp),%r13
4c 8b 6d 08
d7: mov 0x10(%rbp),%r14
4c 8b 75 10
db: mov 0x18(%rbp),%r15
4c 8b 7d 18
df: add $0x28,%rbp
48 83 c5 28
e3: leaveq
c9
e4: retq
c3
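As a quick sanity check when reading the jited dump above, the short
conditional jumps can be verified by hand: the target is the address of the
next instruction plus the signed 8-bit displacement. A small sketch follows;
rel8_target is a name made up here for illustration, and it only handles
the 2-byte Jcc rel8 form (opcodes 0x70..0x7f) that appears in this dump.

```python
# Illustrative check of the x86-64 short-jump targets shown in the jited dump.
# rel8_target is a hypothetical helper, not part of bpftool.

def rel8_target(addr, insn_bytes):
    """Return the branch target of a 2-byte 'Jcc rel8' instruction at addr."""
    opcode, rel = insn_bytes
    if not 0x70 <= opcode <= 0x7f:
        raise ValueError("not a short conditional jump")
    disp = rel - 0x100 if rel >= 0x80 else rel   # sign-extend the displacement
    return addr + len(insn_bytes) + disp         # relative to the next insn


if __name__ == "__main__":
    # "37: jne 0x79" is encoded as 75 40, "cb: jne 0xcf" as 75 02.
    print(hex(rel8_target(0x37, bytes([0x75, 0x40]))))
    print(hex(rel8_target(0xcb, bytes([0x75, 0x02]))))
    # The last instruction (retq, 1 byte) sits at 0xe4, so the image
    # ends at 0xe5, which matches "jited 229B" from "bpftool prog".
    print(0xe4 + 1)
```

The same arithmetic is what any user space consumer of an extracted image
would rely on: the code is position-relative for these branches, while the
helper calls elided in the dump are where an image tied to kernel addresses
would need different handling.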
Thanks,
Daniel