Dear Justin,

Quad loops are generally not effective for table-lookup-intensive tasks. At a certain point, gcc runs out of registers and starts spilling hot variables onto the stack. I've converted a number of dual loops into quad loops, only to discover that they're no faster than the dual-loop version.
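
For readers unfamiliar with the loop shapes being discussed, here is a minimal sketch of a quad loop followed by a dual loop and a single loop. It is not VPP code: `process_frame`, `loop_counts_t`, and the byte-wide lookup table are hypothetical names invented for illustration, and it assumes the simplest possible thresholds (4, 2, 1). Real nodes also track space in the next frame and prefetch ahead, which is omitted here. The iteration counters are there only to make the loop behavior observable.

```c
#include <assert.h>
#include <stdint.h>

/* Iteration counts for each loop; purely illustrative. */
typedef struct
{
  unsigned quad, dual, single;
} loop_counts_t;

/* Hypothetical per-packet work: one table lookup per input byte. */
static loop_counts_t
process_frame (const uint8_t table[256], const uint8_t *in, uint8_t *out,
               unsigned n_left)
{
  loop_counts_t c = { 0, 0, 0 };

  /* Quad loop: four independent lookups per iteration. Each extra
     packet in flight needs its own live variables (r0..r3 here). */
  while (n_left >= 4)
    {
      uint8_t r0 = table[in[0]];
      uint8_t r1 = table[in[1]];
      uint8_t r2 = table[in[2]];
      uint8_t r3 = table[in[3]];

      out[0] = r0;
      out[1] = r1;
      out[2] = r2;
      out[3] = r3;

      in += 4;
      out += 4;
      n_left -= 4;
      c.quad++;
    }

  /* Dual loop: with these thresholds n_left is 0..3 here, so this
     body executes at most once per call. */
  while (n_left >= 2)
    {
      uint8_t r0 = table[in[0]];
      uint8_t r1 = table[in[1]];

      out[0] = r0;
      out[1] = r1;

      in += 2;
      out += 2;
      n_left -= 2;
      c.dual++;
    }

  /* Single loop: at most one packet remains. */
  while (n_left > 0)
    {
      out[0] = table[in[0]];

      in += 1;
      out += 1;
      n_left -= 1;
      c.single++;
    }

  assert (n_left == 0);
  return c;
}
```

Under these assumptions, a call with 255 inputs runs the quad body 63 times, the dual body once, and the single body once. Whether the quad body is actually faster than a dual body depends on how many values the compiler can keep in registers for the real per-packet work.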

Rather than having the sample plugin propagate a bunch of "fetch me a rock" coding work, I went with a dual loop plus a single loop. When doing new development, I shut off the dual loop, make the single loop work, then build the dual (or quad) loop. With experience, building a dual (or quad) loop becomes a mechanical exercise easily done during a boring meeting. (😉)...

In viable quad-loop use cases, there's no performance benefit in also providing a dual loop. The dual-loop code will run at most one time; there's no chance to amortize its fixed overhead.

Thanks… Dave

-----Original Message-----
From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On Behalf Of Justin Iurman
Sent: Monday, November 13, 2017 5:51 AM
To: vpp-dev <vpp-dev@lists.fd.io>
Subject: [vpp-dev] vlib_validate_buffer_enqueue

Hey guys,

In buffer_node.h, there are the following macros:
- vlib_validate_buffer_enqueue_x1
- vlib_validate_buffer_enqueue_x2
- vlib_validate_buffer_enqueue_x4

I was just wondering what the idea is behind using them in a node. Is it for speed? I mean, you're obviously faster if you process 4 packets horizontally than one after the other. Why then, in the sample plugin, is the "x4" version not used? A "perfect" plugin would use each of them to cover each case, right?

Also, why not have an "x8" (or more) version? I guess it's either because of a performance issue or to stop at a specific ceiling.

Thanks!

Justin
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev