On Tue, 3 Jun 2025 at 23:04, Luis Gerhorst <luis.gerho...@fau.de> wrote:
>
> This improves the expressiveness of unprivileged BPF by inserting
> speculation barriers instead of rejecting the programs.
>
> The approach was previously presented at LPC'24 [1] and RAID'24 [2].
>
> To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
> potentially-dangerous unprivileged BPF programs as of
> commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
> branches"). In [2], we have analyzed 364 object files from open source
> projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
> Examples, Parca, and Prevail) and found that this affects 31% to 54% of
> programs.
>
> To resolve this in the majority of cases, this patchset adds a fall-back
> for mitigating Spectre v1 using speculation barriers. The kernel still
> optimistically attempts to verify all speculative paths but uses
> speculation barriers against v1 when unsafe behavior is detected. This
> allows for more programs to be accepted without disabling the BPF
> Spectre mitigations (e.g., by booting with mitigations=off, which makes
> cpu_mitigations_off() return true).
>
> For this, it relies on the fact that speculation barriers generally
> prevent all later instructions from executing if the speculation was not
> correct (not only loads). See patch 7 ("bpf: Fall back to nospec for
> Spectre v1") for a detailed description and references to the relevant
> vendor documentation (AMD and Intel x86-64, ARM64, and PowerPC).
>
> In [1] we have measured the overhead of this approach relative to having
> mitigations off and including the upstream Spectre v4 mitigations. For
> event tracing and stack-sampling profilers, we found that mitigations
> increase BPF program execution time by 0% to 62%. For the Loxilb network
> load balancer, we have measured a 14% slowdown in SCTP performance but
> no significant slowdown for TCP. This overhead only applies to programs
> that were previously rejected.
>
> I reran the expressiveness evaluation with v6.14 and made sure the main
> results still match those from [1] and [2] (which used v6.5).
>
> Main design decisions are:
>
> * Do not use separate bytecode insns for v1 and v4 barriers (inspired by
>   Daniel Borkmann's question at LPC). This simplifies the verifier
>   significantly; its only downside is that performance on PowerPC is
>   not as high as it could be.
>
> * Allow archs to still disable v1/v4 mitigations separately via
>   bpf_jit_bypass_spec_v1/v4(). This lets archs gain improved BPF
>   expressiveness / performance if they are not vulnerable (e.g., ARM64
>   for v4 in the kernel).
>
> * Do not remove the empty BPF_NOSPEC implementation for backends for
>   which it is unknown whether they are vulnerable to Spectre v1.
>
> [1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
>     Spectre-PHT using Speculation Barriers in Linux eBPF")
> [2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
>     Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
>
> Changes:
>
> * v3 -> v4:
>   - Remove insn parameter from do_check_insn() and extract
>     process_bpf_exit_full as a function as requested by Eduard
>   - Investigate apparent sanitize_check_bounds() bug reported by
>     Kartikeya (it appears not to be a bug, only confusing code);
>     sent separate patch to document it and add an assert
>   - Remove already-merged commit 1 ("selftests/bpf: Fix caps for
>     __xlated/jited_unpriv")
>   - Drop former commit 10 ("bpf: Allow nospec-protected var-offset stack
>     access") as it did not include a test and there are other places
>     where var-off is rejected. Also, none of the tested real-world
>     programs used var-off in the paper. Therefore keep the old behavior
>     for now and potentially prepare a patch that converts all cases
>     later if required.
>   - Add link to AMD lfence and PowerPC speculation barrier (ori 31,31,0)
>     documentation
>   - Move detailed barrier documentation to commit 7 ("bpf: Fall back to
>     nospec for Spectre v1")
>   - Link to v3: https://lore.kernel.org/all/20250501073603.1402960-1-luis.gerho...@fau.de/
>

LGTM. For the set,

Acked-by: Kumar Kartikeya Dwivedi <mem...@gmail.com>

> [...]
>
