Hi Konstantin,
>-----Original Message----- >From: Ananyev, Konstantin <konstantin.anan...@intel.com> >Sent: Thursday 8 October 2020 15:55 >To: Olivier Matz <olivier.m...@6wind.com>; Power, Ciara ><ciara.po...@intel.com> >Cc: Coyle, David <david.co...@intel.com>; Singh, Jasvinder ><jasvinder.si...@intel.com>; dev@dpdk.org; O'loingsigh, Mairtin ><mairtin.oloings...@intel.com>; Ryan, Brendan <brendan.r...@intel.com>; >Richardson, Bruce <bruce.richard...@intel.com> >Subject: RE: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD >bitwidth > >> > >> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Ciara Power >> > >> > > When choosing a vector path to take, an extra condition must >> > >> > > be satisfied to ensure the max SIMD bitwidth allows for the >> > >> > > CPU enabled >> > >path. >> > >> > > >> > >> > > The vector path was initially chosen in RTE_INIT, however >> > >> > > this is no longer suitable as we cannot check the max SIMD >> > >> > > bitwidth at that >> > >time. >> > >> > > The default chosen in RTE_INIT is now scalar. For best >> > >> > > performance and to use vector paths, apps must explicitly >> > >> > > call the set algorithm function before using other functions >> > >> > > from this library, as this is where vector handlers are now chosen. >> > >> > >> > >> > [DC] Has it been decided that it is ok to now require >> > >> > applications to pick the CRC algorithm they want to use? >> > >> > >> > >> > An application which previously automatically got SSE4.2 CRC, >> > >> > for example, will now automatically only get scalar. >> > >> > >> > >> > If this is ok, this should probably be called out explicitly in >> > >> > release notes as it may not be Immediately noticeable to users >> > >> > that they now need to select the CRC algo. >> > >> > >> > >> > Actually, in general, the release notes need to be updated for >> > >> > this >> > >> patchset. >> > >> >> > >> The decision to move rte_set_alg() out of RTE_INIT was taken to >> > >> avoid check on max_simd_bitwidth in data path for every single >> > >> time when >> > >> crc_calc() api is invoked. Based on my understanding, >> > >> max_simd_bitwidth is set after eal init, and when used in >> > >> crc_calc(), it might override the default crc algo set during >> > >> RTE_INIT. Therefore, to avoid extra check on max_simd_bitwidth in >> > >> data path, better option will be to use this static >> > >> configuration one time after eal init in the set_algo >> > >API. >> > > >> > >[DC] Yes that is a good change to have made to avoid extra datapath >checks. >> > > >> > >Based on off-list discussion, I now also know the reason behind now >> > >defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was >> > >chosen by RTE_INIT (e.g. >> > >SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD >> > >(64) through the EAL parameter or call to >> > >rte_set_max_simd_bitwidth(), then there is a mismatch if >> > >rte_net_crc_set_alg() is not then called to reconfigure the CRC. >> > >Defaulting to scalar avoids this mismatch and works on all archs >> > > >> > >As I mentioned before, I think this needs to be called out in >> > >release notes, as it's an under-the-hood change which could cause >> > >app performance to drop if app developers aren't aware of it - the >> > >API itself hasn't changed, so they may not read the doxygen :) >> > > >> > >> > Yes that is a good point, I can add to the release notes for this to call >> > it >out. >> >> I don't think it is a good idea to have the scalar crc by default. >> To me, the fastest available CRC has to be enabled by default. >> >> I understand the technical reason why you did it like this however: >> the SIMD bitwidth may not be known at the time the >> RTE_INIT(rte_net_crc_init) function is called. >> >> A simple approach to solve this issue would be to initialize the >> rte_net_crc_handler pointer to a handlers_default. The first time a >> crc is called, the rte_crc32_*_default_handler() function would check >> the configured SIMD bitwidth, and set the handler to the correct one, >> to avoid to do the test for next time. >> >> This approach still does not solve the case where the SIMD bitwidth is >> modified during the life of the application. In this case, a callback >> would have to be registered to notify SIMD bitwidth changes... but I >> don't think it is worth to do it. Instead, it can be documented that >> rte_set_max_simd_bitwidth() has to be called early, before >> rte_eal_init(). > >Actually I also thought about callback approach. >It does complicate things a bit for sure, but on a positive side - it allows to >solve RTE_INIT() code-path selection problem in a generic way, plus it means >zero changes in the data-path. >So probably worth to consider it. > I am not sure adding callbacks to allow for runtime changes to max SIMD bitwidth is worth it. I have sent a new version of my patchset which currently does not have this suggested rework to use callbacks. Thanks, Ciara <snip>