> > >> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Ciara Power When > > >> > > choosing a vector path to take, an extra condition must be > > >> > > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled > > >path. > > >> > > > > >> > > The vector path was initially chosen in RTE_INIT, however this is > > >> > > no longer suitable as we cannot check the max SIMD bitwidth at that > > >time. > > >> > > The default chosen in RTE_INIT is now scalar. For best performance > > >> > > and to use vector paths, apps must explicitly call the set > > >> > > algorithm function before using other functions from this library, > > >> > > as this is where vector handlers are now chosen. > > >> > > > >> > [DC] Has it been decided that it is ok to now require applications > > >> > to pick the CRC algorithm they want to use? > > >> > > > >> > An application which previously automatically got SSE4.2 CRC, for > > >> > example, will now automatically only get scalar. > > >> > > > >> > If this is ok, this should probably be called out explicitly in > > >> > release notes as it may not be Immediately noticeable to users that > > >> > they now need to select the CRC algo. > > >> > > > >> > Actually, in general, the release notes need to be updated for this > > >> patchset. > > >> > > >> The decision to move rte_set_alg() out of RTE_INIT was taken to avoid > > >> check on max_simd_bitwidth in data path for every single time when > > >> crc_calc() api is invoked. Based on my understanding, > > >> max_simd_bitwidth is set after eal init, and when used in crc_calc(), > > >> it might override the default crc algo set during RTE_INIT. Therefore, > > >> to avoid extra check on max_simd_bitwidth in data path, better option > > >> will be to use this static configuration one time after eal init in the > > >> set_algo > > >API. > > > > > >[DC] Yes that is a good change to have made to avoid extra datapath checks. > > > > > >Based on off-list discussion, I now also know the reason behind now > > >defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was chosen > > >by > > >RTE_INIT (e.g. > > >SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD (64) > > >through the EAL parameter or call to rte_set_max_simd_bitwidth(), then > > >there is a mismatch if rte_net_crc_set_alg() is not then called to > > >reconfigure > > >the CRC. Defaulting to scalar avoids this mismatch and works on all archs > > > > > >As I mentioned before, I think this needs to be called out in release > > >notes, as > > >it's an under-the-hood change which could cause app performance to drop if > > >app developers aren't aware of it - the API itself hasn't changed, so they > > >may > > >not read the doxygen :) > > > > > > > Yes that is a good point, I can add to the release notes for this to call > > it out. > > I don't think it is a good idea to have the scalar crc by default. > To me, the fastest available CRC has to be enabled by default. > > I understand the technical reason why you did it like this however: the > SIMD bitwidth may not be known at the time the > RTE_INIT(rte_net_crc_init) function is called. > > A simple approach to solve this issue would be to initialize the > rte_net_crc_handler pointer to a handlers_default. The first time a crc > is called, the rte_crc32_*_default_handler() function would check the > configured SIMD bitwidth, and set the handler to the correct one, to > avoid to do the test for next time. > > This approach still does not solve the case where the SIMD bitwidth is > modified during the life of the application. In this case, a callback > would have to be registered to notify SIMD bitwidth changes... but I > don't think it is worth to do it. Instead, it can be documented that > rte_set_max_simd_bitwidth() has to be called early, before > rte_eal_init().
Actually I also thought about callback approach. It does complicate things a bit for sure, but on a positive side - it allows to solve RTE_INIT() code-path selection problem in a generic way, plus it means zero changes in the data-path. So probably worth to consider it. > > > > > >> > > >> > > >> > > > > >> > > Suggested-by: Jasvinder Singh <jasvinder.si...@intel.com> > > >> > > > > >> > > Signed-off-by: Ciara Power <ciara.po...@intel.com> > > >> > > > > >> > > --- > > >> > > v3: > > >> > > - Moved choosing vector paths out of RTE_INIT. > > >> > > - Moved checking max_simd_bitwidth into the set_alg function. > > >> > > --- > > >> > > lib/librte_net/rte_net_crc.c | 26 +++++++++++++++++--------- > > >> > > lib/librte_net/rte_net_crc.h | 3 ++- > > >> > > 2 files changed, 19 insertions(+), 10 deletions(-) > > >> > > > > >> > > diff --git a/lib/librte_net/rte_net_crc.c > > >> > > b/lib/librte_net/rte_net_crc.c index > > >> > > 9fd4794a9d..241eb16399 100644 > > >> > > --- a/lib/librte_net/rte_net_crc.c > > >> > > +++ b/lib/librte_net/rte_net_crc.c > > >> > > > >> > <snip> > > >> > > > >> > > @@ -145,18 +149,26 @@ rte_crc32_eth_handler(const uint8_t *data, > > >> > > uint32_t data_len) void rte_net_crc_set_alg(enum rte_net_crc_alg > > >> > > alg) { > > >> > > + if (max_simd_bitwidth == 0) > > >> > > + max_simd_bitwidth = rte_get_max_simd_bitwidth(); > > >> > > + > > >> > > switch (alg) { > > >> > > #ifdef X86_64_SSE42_PCLMULQDQ > > >> > > case RTE_NET_CRC_SSE42: > > >> > > - handlers = handlers_sse42; > > >> > > - break; > > >> > > + if (max_simd_bitwidth >= RTE_MAX_128_SIMD) { > > >> > > + handlers = handlers_sse42; > > >> > > + return; > > >> > > + } > > >> > > + RTE_LOG(INFO, NET, "Max SIMD Bitwidth too low, using > > >> > > scalar\n"); > > >> > > > >> > [DC] Not sure if you're aware but there is another patchset which > > >> > adds an > > >> > AVX512 CRC implementation and run-time checking of cpuflags to > > >> > select the CRC path to use: > > >> > https://patchwork.dpdk.org/project/dpdk/list/?series=12596 > > >> > > > >> > There will be a task to merge these 2 patchsets if both are merged. > > >> > It looks fairly straightforward to me to merge these, but it would > > >> > be good if you take a look too > > > > I have looked at that patchset, I agree, I think they will be > > straightforward to merge together. > > > > Thanks, > > Ciara