On 02/01/2019 12:40 AM, Alexei Starovoitov wrote:
> Many algorithms need to read and modify several variables atomically.
> Until now it was hard to impossible to implement such algorithms in BPF.
> Hence introduce support for bpf_spin_lock.
>
> The api consists of 'struct bpf_spin_lock' that should be placed
> inside hash/array/cgroup_local_storage element
> and bpf_spin_lock/unlock() helper function.
>
> Example:
> struct hash_elem {
> 	int cnt;
> 	struct bpf_spin_lock lock;
> };
> struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
> if (val) {
> 	bpf_spin_lock(&val->lock);
> 	val->cnt++;
> 	bpf_spin_unlock(&val->lock);
> }
>
> and BPF_F_LOCK flag for lookup/update bpf syscall commands that
> allows user space to read/write map elements under lock.
>
> Together these primitives allow race free access to map elements
> from bpf programs and from user space.
>
> Key restriction: root only.
> Key requirement: maps must be annotated with BTF.
>
> This concept was discussed at Linux Plumbers Conference 2018.
> Thank you everyone who participated and helped to iron out details
> of api and implementation.
>
> Patch 1: bpf_spin_lock support in the verifier, BTF, hash, array.
> Patch 2: bpf_spin_lock in cgroup local storage.
> Patches 3,4,5: tests
> Patch 6: BPF_F_LOCK flag to lookup/update
> Patches 7,8,9: tests
>
> v6->v7:
> - fixed this_cpu->__this_cpu per Peter's suggestion and added Ack.
> - simplified bpf_spin_lock and load/store overlap check in the verifier
>   as suggested by Andrii
> - rebase
>
> v5->v6:
> - adopted arch_spinlock approach suggested by Peter
> - switched to spin_lock_irqsave equivalent as the simplest way
>   to avoid deadlocks in rare case of nested networking progs
>   (cgroup-bpf prog in preempt_disable vs clsbpf in softirq sharing
>   the same map with bpf_spin_lock)
>   bpf_spin_lock is only allowed in networking progs that don't
>   have arbitrary entry points unlike tracing progs.
> - rebase and split test_verifier tests
>
> v4->v5:
> - disallow bpf_spin_lock for tracing progs due to insufficient preemption
>   checks
> - socket filter progs cannot use bpf_spin_lock due to missing preempt_disable
> - fix atomic_set_release. Spotted by Peter.
> - fixed hash_of_maps
>
> v3->v4:
> - fix BPF_EXIST | BPF_NOEXIST check patch 6. Spotted by Jakub. Thanks!
> - rebase
>
> v2->v3:
> - fixed build on ia64 and archs where qspinlock is not supported
> - fixed missing lock init during lookup w/o BPF_F_LOCK. Spotted by Martin
>
> v1->v2:
> - addressed several issues spotted by Daniel and Martin in patch 1
> - added test11 to patch 4 as suggested by Daniel
>
> Alexei Starovoitov (9):
>   bpf: introduce bpf_spin_lock
>   bpf: add support for bpf_spin_lock to cgroup local storage
>   tools/bpf: sync include/uapi/linux/bpf.h
>   selftests/bpf: add bpf_spin_lock verifier tests
>   selftests/bpf: add bpf_spin_lock C test
>   bpf: introduce BPF_F_LOCK flag
>   tools/bpf: sync uapi/bpf.h
>   libbpf: introduce bpf_map_lookup_elem_flags()
>   selftests/bpf: test for BPF_F_LOCK
>
>  include/linux/bpf.h                          |  39 ++-
>  include/linux/bpf_verifier.h                 |   1 +
>  include/linux/btf.h                          |   1 +
>  include/uapi/linux/bpf.h                     |   8 +-
>  kernel/Kconfig.locks                         |   3 +
>  kernel/bpf/arraymap.c                        |  23 +-
>  kernel/bpf/btf.c                             |  42 +++
>  kernel/bpf/core.c                            |   2 +
>  kernel/bpf/hashtab.c                         |  63 +++-
>  kernel/bpf/helpers.c                         |  96 +++++
>  kernel/bpf/local_storage.c                   |  16 +-
>  kernel/bpf/map_in_map.c                      |   5 +
>  kernel/bpf/syscall.c                         |  45 ++-
>  kernel/bpf/verifier.c                        | 171 ++++++++-
>  net/core/filter.c                            |  16 +-
>  tools/include/uapi/linux/bpf.h               |   8 +-
>  tools/lib/bpf/bpf.c                          |  13 +
>  tools/lib/bpf/bpf.h                          |   2 +
>  tools/lib/bpf/libbpf.map                     |   1 +
>  tools/testing/selftests/bpf/Makefile         |   2 +-
>  tools/testing/selftests/bpf/bpf_helpers.h    |   4 +
>  tools/testing/selftests/bpf/test_map_lock.c  |  66 ++++
>  tools/testing/selftests/bpf/test_progs.c     | 117 ++++++-
>  tools/testing/selftests/bpf/test_spin_lock.c | 108 ++++++
>  tools/testing/selftests/bpf/test_verifier.c  | 104 +++++-
>  .../selftests/bpf/verifier/spin_lock.c       | 331 ++++++++++++++++++
>  26 files changed, 1248 insertions(+), 39 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/test_map_lock.c
>  create mode 100644 tools/testing/selftests/bpf/test_spin_lock.c
>  create mode 100644 tools/testing/selftests/bpf/verifier/spin_lock.c
>
Applied, thanks!
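
For readers following the thread, the locking pattern from the cover letter can be modelled in plain user-space C. This is only a rough sketch under stated assumptions: it is not BPF code and not the kernel implementation. Every model_* name is hypothetical, the 'bytes' field is added purely for illustration, and a C11 atomic_flag stands in for 'struct bpf_spin_lock'. It shows a lock embedded in the element, several fields updated as one unit, and a copy taken under the lock, loosely analogous to a lookup with BPF_F_LOCK.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <assert.h>

/* Hypothetical user-space stand-in for struct bpf_spin_lock. */
struct model_spin_lock {
	atomic_flag locked;
};

/* Same shape as the hash_elem example: data plus an embedded lock.
 * 'bytes' is an illustrative second field, not part of the original. */
struct model_elem {
	int cnt;
	int bytes;
	struct model_spin_lock lock;
};

static void model_elem_init(struct model_elem *e)
{
	e->cnt = 0;
	e->bytes = 0;
	atomic_flag_clear(&e->lock.locked);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_spin_trylock(struct model_spin_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static void model_spin_lock(struct model_spin_lock *l)
{
	while (!model_spin_trylock(l))
		; /* busy-wait until the holder releases the lock */
}

static void model_spin_unlock(struct model_spin_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Update two fields as one unit, the way a program would between
 * bpf_spin_lock() and bpf_spin_unlock(). */
static void model_account(struct model_elem *e, int len)
{
	model_spin_lock(&e->lock);
	e->cnt++;
	e->bytes += len;
	model_spin_unlock(&e->lock);
}

/* Copy the fields out while holding the lock, so the caller never
 * observes a half-updated element. */
static void model_lookup_locked(struct model_elem *e, int *cnt, int *bytes)
{
	model_spin_lock(&e->lock);
	*cnt = e->cnt;
	*bytes = e->bytes;
	model_spin_unlock(&e->lock);
}
```

A caller would run model_elem_init() once, call model_account() from each writer, and use model_lookup_locked() to take a consistent snapshot. The restrictions described in the cover letter (root only, BTF-annotated maps, networking program types) are enforced by the kernel and are not modelled here.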