On Tue, 29 Jan 2019 at 12:17, Daniel Borkmann <dan...@iogearbox.net> wrote:
>
> On 01/29/2019 10:57 AM, bjorn.to...@gmail.com wrote:
> > From: Björn Töpel <bjorn.to...@intel.com>
> >
> > GCC will generate jump tables for switch-statements with more than 5
> > case statements. An entry into the jump table is an indirect call,
> > which means that for CONFIG_RETPOLINE builds, this is rather
> > expensive.
> >
> > This commit replaces the switch-statement that acts on the XDP program
> > result with an if-clause.
> >
> > The if-clause was also refactored into a common function that can be
> > used by AF_XDP zero-copy and non-zero-copy code.
> >
> > Performance prior this patch:
> > $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> > Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      20      18983018    0
> > XDP-RX CPU      total   18983018
> >
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index   20:20  18983012    0
> > rx_queue_index   20:sum 18983012
> >
> > $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
> >  sock0@enp134s0f0:20 rxdrop
> >                 pps         pkts        2.00
> > rx              14,641,496  144,751,092
> > tx              0           0
> >
> > And after:
> > $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> > Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      20      24000986    0
> > XDP-RX CPU      total   24000986
> >
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index   20:20  24000985    0
> > rx_queue_index   20:sum 24000985
> >
> > +26%
> >
> > $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
> >  sock0@enp134s0f0:20 rxdrop
> >                 pps         pkts        2.00
> > rx              17,623,578  163,503,263
> > tx              0           0
> >
> > +20%
> >
> > Signed-off-by: Björn Töpel <bjorn.to...@intel.com>
>
> Looks good.
> Given the performance improvements, wondering in general whether
> it would make sense to raise the default limit for generating jump tables if
> we have CONFIG_RETPOLINE enabled; as in:
>
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 9c5a67d..33495a9 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -217,6 +217,8 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
>  # Avoid indirect branches in kernel to deal with Spectre
>  ifdef CONFIG_RETPOLINE
>    KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
> +  # Avoid generating slow indirect jumps for small number of switch cases
> +  KBUILD_CFLAGS += --param case-values-threshold=12

Yes, it might make sense to raise it. All XDP-capable drivers use a
switch to act on the action. The GCC default for x86-64 is 5; I'm
curious why you're suggesting 12, I'd pick 17. ;-P


Björn

>  endif
>
>  archscripts: scripts_basic
>
> That would likely bloat the kernel a bit also in slow-path places where it
> would not be needed, but it would generically catch majority of cases. I'll
> run some experiments later today (but in any case that should not block this
> patch here).
>
> Cheers,
> Daniel
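
For readers following the thread, here is a minimal user-space sketch of the two forms being compared: a switch on the XDP verdict versus an equivalent if-chain. It is not the code from the patch. The ex_ names and the verdict enum are hypothetical, and the action values are redefined locally (mirroring enum xdp_action) so the file builds outside the kernel.

```c
#include <assert.h>

/* Local copy of the XDP action values (same order as the kernel's
 * enum xdp_action), prefixed EX_ so this builds in user space. */
enum ex_xdp_action {
	EX_XDP_ABORTED = 0,
	EX_XDP_DROP,
	EX_XDP_PASS,
	EX_XDP_TX,
	EX_XDP_REDIRECT,
};

/* Hypothetical driver-side verdicts, for illustration only. */
enum ex_verdict {
	EX_PASS_TO_STACK,
	EX_CONSUMED,
	EX_TRANSMIT,
	EX_REDIRECTED,
};

/* Switch form: depending on the case count and case-values-threshold,
 * GCC may lower this to a jump table, i.e. an indirect jump. */
static enum ex_verdict ex_handle_switch(unsigned int act)
{
	switch (act) {
	case EX_XDP_PASS:
		return EX_PASS_TO_STACK;
	case EX_XDP_TX:
		return EX_TRANSMIT;
	case EX_XDP_REDIRECT:
		return EX_REDIRECTED;
	case EX_XDP_ABORTED:
	case EX_XDP_DROP:
	default:
		return EX_CONSUMED;
	}
}

/* If-chain form: only direct conditional branches, so nothing here
 * needs a retpoline thunk. */
static enum ex_verdict ex_handle_if(unsigned int act)
{
	if (act == EX_XDP_PASS)
		return EX_PASS_TO_STACK;
	if (act == EX_XDP_TX)
		return EX_TRANSMIT;
	if (act == EX_XDP_REDIRECT)
		return EX_REDIRECTED;
	return EX_CONSUMED;	/* aborted, drop, and unknown actions */
}

/* Check that both forms agree, including on out-of-range values. */
static void ex_check_equivalence(void)
{
	unsigned int act;

	for (act = 0; act < 8; act++)
		assert(ex_handle_switch(act) == ex_handle_if(act));
}
```

Compiling this with gcc -O2 -S, with and without --param case-values-threshold=N, and comparing the assembly for ex_handle_switch is one way to see whether a given threshold still produces a jump table for a switch of this size.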