[re-sending w/o extra email addresses] Hi vpp-dev,
So I'm writing a plugin to support IPTFS (https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01), and during this process I've written a macro to support prefetching the second half of N buffers (where N is a power of 2) while acting on the first half, i.e., the standard pattern of prefetch code in VPP. The macro lets one write the "single buffer" case only once, rather than keeping a copy of the code for each power of 2 to be supported. One slight variation on standard VPP style is that the macro and code use beginning and end pointers for the buffer index arrays instead of an array index and array length.

I'm sending this mail because it may be a little while before I can submit my code, and the ARM folks are currently adding prefetch code, so I thought this could be useful for that. I might also get some useful feedback on the code. :)

Here are the prefetch macros:

static inline int
half_range (vlib_buffer_t **start, vlib_buffer_t **end, int limit)
{
  u32 range = (end - start) / 2;
  return range < limit ? range : limit;
}

#define FOREACH_PREFETCH(b, eb)                                               \
  do                                                                          \
    {                                                                         \
      vlib_buffer_t **bp, **ebp;                                              \
      for (u32 half = half_range ((b), (eb), CLIB_N_PREFETCHES); (b) < (eb);  \
           half = half_range ((b), (eb), CLIB_N_PREFETCHES))                  \
        {                                                                     \
          /* prefetch second half */                                          \
          if (half)                                                           \
            for (bp = (b) + half, ebp = bp + half; bp < ebp; bp++)            \
              vlib_prefetch_buffer_header (*bp, LOAD);                        \
          else                                                                \
            half = 1; /* Process at least one next */                         \
                                                                              \
          /* process first half or last one */                                \
          for (ebp = (b) + half; (b) < ebp && to < eto; (b)++)

#define END_FOREACH_PREFETCH                                                  \
        }                                                                     \
    }                                                                         \
  while (0)

And here are the macros I use to handle filling up next frames in parallel, working with arrays using end pointers vs. indices and lengths:
/* in node_funcs.h */
#define vlib_get_next_frame_macro_p(vm,node,next_index,vectors,evectors,alloc_new_frame) \
do {                                                                    \
  vlib_frame_t *_f                                                      \
    = vlib_get_next_frame_internal ((vm), (node), (next_index),         \
                                    (alloc_new_frame));                 \
  u32 _n = _f->n_vectors;                                               \
  (vectors) = vlib_frame_vector_args (_f) + _n * sizeof ((vectors)[0]); \
  (evectors) = (vectors) + (VLIB_FRAME_SIZE - _n);                      \
} while (0)

#define vlib_get_next_frame_p(vm,node,next_index,vectors,evectors) \
  vlib_get_next_frame_macro_p (vm, node, next_index,               \
                               vectors, evectors,                  \
                               /* alloc new frame */ 0)

/* in iptfs code.. */
#define vlib_put_get_next_frame(vm, node, ni, to, eto)                  \
  do                                                                    \
    {                                                                   \
      /* Put the frame if it is full */                                 \
      if ((to) && (eto) != (to))                                        \
        ;                                                               \
      else                                                              \
        {                                                               \
          if ((to))                                                     \
            vlib_put_next_frame ((vm), (node), (ni), (eto) - (to));     \
          vlib_get_next_frame_p ((vm), (node), (ni), (to), (eto));      \
        }                                                               \
    }                                                                   \
  while (0)

#define vlib_put_get_next_frame_a(vm, node, ni, toa, etoa) \
  vlib_put_get_next_frame (vm, node, ni, (toa)[(ni)], (etoa)[(ni)])

Finally, here's the payoff: a slightly trimmed-down use of the above with only a single copy of the functional code. In this case it starts by prefetching 8 buffers and handling 8, then 4, then 2, then 1, then none (it starts with 8 because CLIB_N_PREFETCHES is 16, but any power of 2 would work). The normal pattern in VPP would require 5 copies of the code in this case instead of the single copy used here.

iptfs_encap_node_inline (vlib_main_t *vm, vlib_node_runtime_t *node,
                         vlib_frame_t *frame, int is_tun)
{
  vlib_buffer_t *bufs[VLIB_FRAME_SIZE];
  vlib_buffer_t **b = bufs, **eb = bufs + frame->n_vectors;

  vlib_get_buffers (vm, vlib_frame_vector_args (frame), bufs,
                    frame->n_vectors);

  u32 *to[IPTFS_ENCAP_N_NEXT] = {};
  u32 *eto[IPTFS_ENCAP_N_NEXT] = {};

  while (b < eb)
    {
      FOREACH_PREFETCH (b, eb)
      {
        /*
         * Handle single buffer found in (*b)
         */
        u8 *data = vlib_buffer_get_current (*b);

        /* ... do stuff with packet in buffer (*b) ...
         */

        if (send to SOME_NEXT_NODE condition)
          {
            ni = SOME_NEXT_NODE_INDEX;
            vlib_put_get_next_frame_a (vm, node, ni, to, eto);
            *to[ni]++ = vlib_get_buffer_index (vm, *b);
            continue;
          }

        if (drop condition)
          {
          dropit:
            vlib_put_get_next_frame_a (vm, node, IPTFS_ENCAP_NEXT_DROP,
                                       to, eto);
            *to[IPTFS_ENCAP_NEXT_DROP]++ = vlib_get_buffer_index (vm, *b);
            (*b)->error = node->errors[IPTFS_ENCAP_ERROR_Q_FULL];
            continue;
          }
      }
      END_FOREACH_PREFETCH;
    }

  /* Put any next frames we've used */
  for (uint i = 0; i < IPTFS_ENCAP_N_NEXT; i++)
    if (to[i])
      vlib_put_next_frame (vm, node, i, eto[i] - to[i]);

  return frame->n_vectors;
}

Thanks,
Chris.