On Wed, Dec 02, 2015 at 05:57:10PM +0100, Thomas Monjalon wrote:
> 2015-12-02 22:23, Jerin Jacob:
> > On Wed, Dec 02, 2015 at 05:40:13PM +0100, Thomas Monjalon wrote:
> > > 2015-12-02 20:04, Jerin Jacob:
> > > > On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > > > > On 2 December 2015 at 18:39, Jerin Jacob
> > > > > <jerin.jacob at caviumnetworks.com> wrote:
> > > > > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > > > > that lead to multiple definition and its not good.
> > > > > >
> > > > > But you will have similar issue since "typedef int32x4_t __m128i"
> > > > > appears in both your patch and this header file.
> > > >
> > > > I just tested it, it won't break, back to back
> > > > "typedef int32x4_t __m128i" is fine(unlike inline function).
> > > >
> > > > my intention to keep __m128i "as is" because changing the __m128i
> > > > to rte_??? something would break the ABI.
> > >
> > > Isn't it already broken in 2.2?
> >
> > Does it mean, You would like to have rte_128i(or similar) kind of
> > abstraction to represent 128bit SIMD variable in DPDK?
>
> If you are convinced that it is the best way to write a generic code, yes.

I grepped through the DPDK API list to see which API definitions depend
on SIMD types. Only rte_lpm_lookupx4 has a SIMD dependency in its
definition, and I believe that is the root cause of the problem.

IMO, the better fix would be to remove __m128i from the API and use a
more general representation, so that the API no longer depends on the
architecture. Something like this:

    rte_lpm_lookupx4(const struct rte_lpm *lpm, uint32_t *ip,
                     uint16_t hop[4], uint16_t defv)

instead of

    rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip,
                     uint16_t hop[4], uint16_t defv)

I am not sure why this API was created this way. From the l3fwd.c
example, it looks like it was done to accommodate the IPv4 byte swap [1].
If that is true, maybe we can have an EAL byte swap abstraction that does
an optimized in-memory byte swap of 4 IP addresses in one shot, or have
rte_lpm_lookupx4 take an argument that says whether to byte swap, or
something similar?

Thoughts?

[1]
    const __m128i bswap_mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
                                            4, 5, 6, 7, 0, 1, 2, 3);

    /* Byte swap 4 IPV4 addresses. */
    dip = _mm_shuffle_epi8(dip, bswap_mask);

Jerin

> I think the most important question is to know what is the best solution
> for performance and maintainability. The API/ABI questions will be considered
> after.
>
> Thanks for your involvement guys.
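
For illustration only, here is a minimal sketch of what the byte swap abstraction suggested above could look like in portable C. The names bswap32 and bswap32_x4 are hypothetical and are not existing DPDK APIs. The sketch assumes the four addresses sit in a plain uint32_t array, matching the proposed uint32_t *ip prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an existing DPDK API: byte swap one 32-bit word. */
static inline uint32_t
bswap32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8) |
	       ((x & 0xff000000u) >> 24);
}

/*
 * Hypothetical helper: byte swap 4 IPv4 addresses in memory.
 * This is the scalar fallback. An architecture-specific build could
 * replace the loop with SSSE3 (_mm_shuffle_epi8) or NEON (vrev32q_u8)
 * without changing the prototype.
 */
static inline void
bswap32_x4(uint32_t ip[4])
{
	size_t i;

	assert(ip != NULL);
	for (i = 0; i < 4; i++)
		ip[i] = bswap32(ip[i]);
}
```

With a helper of this shape, the caller would swap the four destination addresses in place and then pass the array to the lookup, so no SIMD type would appear in the public prototype. Whether the scalar fallback is fast enough is a separate question that would need measuring.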