Hi David, Thanks for reviewing,
>-----Original Message----- >From: Coyle, David <david.co...@intel.com> >Sent: Thursday 1 October 2020 15:17 >To: Singh, Jasvinder <jasvinder.si...@intel.com>; Power, Ciara ><ciara.po...@intel.com>; dev@dpdk.org >Cc: Power, Ciara <ciara.po...@intel.com>; Olivier Matz ><olivier.m...@6wind.com>; O'loingsigh, Mairtin ><mairtin.oloings...@intel.com>; Ryan, Brendan <brendan.r...@intel.com>; >Richardson, Bruce <bruce.richard...@intel.com> >Subject: RE: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD >bitwidth > >Hi Jasvinder/Ciara > >> -----Original Message----- >> From: Singh, Jasvinder <jasvinder.si...@intel.com> >> > >> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Ciara Power When >> > > choosing a vector path to take, an extra condition must be >> > > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled >path. >> > > >> > > The vector path was initially chosen in RTE_INIT, however this is >> > > no longer suitable as we cannot check the max SIMD bitwidth at that >time. >> > > The default chosen in RTE_INIT is now scalar. For best performance >> > > and to use vector paths, apps must explicitly call the set >> > > algorithm function before using other functions from this library, >> > > as this is where vector handlers are now chosen. >> > >> > [DC] Has it been decided that it is ok to now require applications >> > to pick the CRC algorithm they want to use? >> > >> > An application which previously automatically got SSE4.2 CRC, for >> > example, will now automatically only get scalar. >> > >> > If this is ok, this should probably be called out explicitly in >> > release notes as it may not be Immediately noticeable to users that >> > they now need to select the CRC algo. >> > >> > Actually, in general, the release notes need to be updated for this >> patchset. >> >> The decision to move rte_set_alg() out of RTE_INIT was taken to avoid >> check on max_simd_bitwidth in data path for every single time when >> crc_calc() api is invoked. Based on my understanding, >> max_simd_bitwidth is set after eal init, and when used in crc_calc(), >> it might override the default crc algo set during RTE_INIT. Therefore, >> to avoid extra check on max_simd_bitwidth in data path, better option >> will be to use this static configuration one time after eal init in the >> set_algo >API. > >[DC] Yes that is a good change to have made to avoid extra datapath checks. > >Based on off-list discussion, I now also know the reason behind now >defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was chosen by >RTE_INIT (e.g. >SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD (64) >through the EAL parameter or call to rte_set_max_simd_bitwidth(), then >there is a mismatch if rte_net_crc_set_alg() is not then called to reconfigure >the CRC. Defaulting to scalar avoids this mismatch and works on all archs > >As I mentioned before, I think this needs to be called out in release notes, as >it's an under-the-hood change which could cause app performance to drop if >app developers aren't aware of it - the API itself hasn't changed, so they may >not read the doxygen :) > Yes that is a good point, I can add to the release notes for this to call it out. >> >> >> > > >> > > Suggested-by: Jasvinder Singh <jasvinder.si...@intel.com> >> > > >> > > Signed-off-by: Ciara Power <ciara.po...@intel.com> >> > > >> > > --- >> > > v3: >> > > - Moved choosing vector paths out of RTE_INIT. >> > > - Moved checking max_simd_bitwidth into the set_alg function. >> > > --- >> > > lib/librte_net/rte_net_crc.c | 26 +++++++++++++++++--------- >> > > lib/librte_net/rte_net_crc.h | 3 ++- >> > > 2 files changed, 19 insertions(+), 10 deletions(-) >> > > >> > > diff --git a/lib/librte_net/rte_net_crc.c >> > > b/lib/librte_net/rte_net_crc.c index >> > > 9fd4794a9d..241eb16399 100644 >> > > --- a/lib/librte_net/rte_net_crc.c >> > > +++ b/lib/librte_net/rte_net_crc.c >> > >> > <snip> >> > >> > > @@ -145,18 +149,26 @@ rte_crc32_eth_handler(const uint8_t *data, >> > > uint32_t data_len) void rte_net_crc_set_alg(enum rte_net_crc_alg >> > > alg) { >> > > + if (max_simd_bitwidth == 0) >> > > + max_simd_bitwidth = rte_get_max_simd_bitwidth(); >> > > + >> > > switch (alg) { >> > > #ifdef X86_64_SSE42_PCLMULQDQ >> > > case RTE_NET_CRC_SSE42: >> > > - handlers = handlers_sse42; >> > > - break; >> > > + if (max_simd_bitwidth >= RTE_MAX_128_SIMD) { >> > > + handlers = handlers_sse42; >> > > + return; >> > > + } >> > > + RTE_LOG(INFO, NET, "Max SIMD Bitwidth too low, using >> > > scalar\n"); >> > >> > [DC] Not sure if you're aware but there is another patchset which >> > adds an >> > AVX512 CRC implementation and run-time checking of cpuflags to >> > select the CRC path to use: >> > https://patchwork.dpdk.org/project/dpdk/list/?series=12596 >> > >> > There will be a task to merge these 2 patchsets if both are merged. >> > It looks fairly straightforward to me to merge these, but it would >> > be good if you take a look too I have looked at that patchset, I agree, I think they will be straightforward to merge together. Thanks, Ciara