On Tue, 3 Jun 2025 at 23:04, Luis Gerhorst <luis.gerho...@fau.de> wrote:
>
> This improves the expressiveness of unprivileged BPF by inserting
> speculation barriers instead of rejecting the programs.
>
> The approach was previously presented at LPC'24 [1] and RAID'24 [2].
>
> To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
> potentially-dangerous unprivileged BPF programs as of
> commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
> branches"). In [2], we have analyzed 364 object files from open source
> projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
> Examples, Parca, and Prevail) and found that this affects 31% to 54% of
> programs.
>
> To resolve this in the majority of cases this patchset adds a fall-back
> for mitigating Spectre v1 using speculation barriers. The kernel still
> optimistically attempts to verify all speculative paths but uses
> speculation barriers against v1 when unsafe behavior is detected. This
> allows for more programs to be accepted without disabling the BPF
> Spectre mitigations (e.g., by setting cpu_mitigations_off()).
>
> For this, it relies on the fact that speculation barriers generally
> prevent all later instructions from executing if the speculation was not
> correct (not only loads). See patch 7 ("bpf: Fall back to nospec for
> Spectre v1") for a detailed description and references to the relevant
> vendor documentation (AMD and Intel x86-64, ARM64, and PowerPC).
>
> In [1] we have measured the overhead of this approach relative to having
> mitigations off and including the upstream Spectre v4 mitigations. For
> event tracing and stack-sampling profilers, we found that mitigations
> increase BPF program execution time by 0% to 62%. For the Loxilb network
> load balancer, we have measured a 14% slowdown in SCTP performance but
> no significant slowdown for TCP. This overhead only applies to programs
> that were previously rejected.
>
> I reran the expressiveness-evaluation with v6.14 and made sure the main
> results still match those from [1] and [2] (which used v6.5).
>
> Main design decisions are:
>
> * Do not use separate bytecode insns for v1 and v4 barriers (inspired by
>   Daniel Borkmann's question at LPC). This simplifies the verifier
>   significantly and has the only downside that performance on PowerPC is
>   not as high as it could be.
>
> * Allow archs to still disable v1/v4 mitigations separately by setting
>   bpf_jit_bypass_spec_v1/v4(). This has the benefit that archs can
>   benefit from improved BPF expressiveness / performance if they are not
>   vulnerable (e.g., ARM64 for v4 in the kernel).
>
> * Do not remove the empty BPF_NOSPEC implementation for backends for
>   which it is unknown whether they are vulnerable to Spectre v1.
>
> [1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
>     Spectre-PHT using Speculation Barriers in Linux eBPF")
> [2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
>     Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
>
> Changes:
>
> * v3 -> v4:
>   - Remove insn parameter from do_check_insn() and extract
>     process_bpf_exit_full as a function as requested by Eduard
>   - Investigate apparent sanitize_check_bounds() bug reported by
>     Kartikeya (does appear to not be a bug but only confusing code),
>     sent separate patch to document it and add an assert
>   - Remove already-merged commit 1 ("selftests/bpf: Fix caps for
>     __xlated/jited_unpriv")
>   - Drop former commit 10 ("bpf: Allow nospec-protected var-offset stack
>     access") as it did not include a test and there are other places
>     where var-off is rejected. Also, none of the tested real-world
>     programs used var-off in the paper. Therefore keep the old behavior
>     for now and potentially prepare a patch that converts all cases
>     later if required.
>   - Add link to AMD lfence and PowerPC speculation barrier (ori 31,31,0)
>     documentation
>   - Move detailed barrier documentation to commit 7 ("bpf: Fall back to
>     nospec for Spectre v1")
>   - Link to v3:
>     https://lore.kernel.org/all/20250501073603.1402960-1-luis.gerho...@fau.de/
>

LGTM. For the set,
Acked-by: Kumar Kartikeya Dwivedi <mem...@gmail.com>

> [...]
>
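
For readers skimming the thread, the fall-back described in the cover
letter can be pictured with a toy model. This is only a sketch under
stated assumptions and not the kernel verifier: struct toy_insn,
toy_check() and toy_count_barriers() are hypothetical names, the
"unsafe_if_mispredicted" flag stands in for whatever the real analysis
detects on a speculative path, and placing the barrier in front of the
flagged instruction is an assumption of the model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction; hypothetical, NOT the kernel's struct bpf_insn. */
struct toy_insn {
	bool unsafe_if_mispredicted; /* unsafe only on a mispredicted path */
	bool nospec_before;          /* set by toy_check(): barrier wanted here */
};

/*
 * Walk the program once.
 *
 * Without the fall-back, the first speculatively unsafe instruction makes
 * the whole program fail with -EACCES (the "reject" behavior).
 * With the fall-back, the instruction is marked for a speculation barrier
 * and the walk continues, so the program is accepted.
 */
static int toy_check(struct toy_insn *prog, size_t len, bool nospec_fallback)
{
	for (size_t i = 0; i < len; i++) {
		if (!prog[i].unsafe_if_mispredicted)
			continue;
		if (!nospec_fallback)
			return -EACCES;
		prog[i].nospec_before = true;
	}
	return 0;
}

/* Count how many barriers the pass asked for. */
static size_t toy_count_barriers(const struct toy_insn *prog, size_t len)
{
	size_t n = 0;

	for (size_t i = 0; i < len; i++)
		n += prog[i].nospec_before ? 1 : 0;
	return n;
}
```

In this model a program with one flagged instruction is rejected when
nospec_fallback is false and accepted with exactly one barrier when it is
true, which mirrors the "overhead only applies to programs that were
previously rejected" point from the cover letter.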