> > >> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Ciara Power When
> > >> > > choosing a vector path to take, an extra condition must be
> > >> > > satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
> > >path.
> > >> > >
> > >> > > The vector path was initially chosen in RTE_INIT, however this is
> > >> > > no longer suitable as we cannot check the max SIMD bitwidth at that
> > >time.
> > >> > > The default chosen in RTE_INIT is now scalar. For best performance
> > >> > > and to use vector paths, apps must explicitly call the set
> > >> > > algorithm function before using other functions from this library,
> > >> > > as this is where vector handlers are now chosen.
> > >> >
> > >> > [DC] Has it been decided that it is ok to now require applications
> > >> > to pick the CRC algorithm they want to use?
> > >> >
> > >> > An application which previously automatically got SSE4.2 CRC, for
> > >> > example, will now automatically only get scalar.
> > >> >
> > >> > If this is ok, this should probably be called out explicitly in
> > >> > release notes as it may not be Immediately noticeable to users that
> > >> > they now need to select the CRC algo.
> > >> >
> > >> > Actually, in general, the release notes need to be updated for this
> > >> patchset.
> > >>
> > >> The decision to move rte_set_alg() out of RTE_INIT was taken to avoid
> > >> check on max_simd_bitwidth in data path for every single time when
> > >> crc_calc() api is invoked. Based on my understanding,
> > >> max_simd_bitwidth is set after eal init, and when used in crc_calc(),
> > >> it might override the default crc algo set during RTE_INIT. Therefore,
> > >> to avoid extra check on max_simd_bitwidth in data path,  better option
> > >> will be to use this static configuration one time after eal init in the 
> > >> set_algo
> > >API.
> > >
> > >[DC] Yes that is a good change to have made to avoid extra datapath checks.
> > >
> > >Based on off-list discussion, I now also know the reason behind now
> > >defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was chosen 
> > >by
> > >RTE_INIT (e.g.
> > >SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD (64)
> > >through the EAL parameter or call to rte_set_max_simd_bitwidth(), then
> > >there is a mismatch if rte_net_crc_set_alg() is not then called to 
> > >reconfigure
> > >the CRC. Defaulting to scalar avoids this mismatch and works on all archs
> > >
> > >As I mentioned before, I think this needs to be called out in release 
> > >notes, as
> > >it's an under-the-hood change which could cause app performance to drop if
> > >app developers aren't aware of it - the API itself hasn't changed, so they 
> > >may
> > >not read the doxygen :)
> > >
> >
> > Yes that is a good point, I can add to the release notes for this to call 
> > it out.
> 
> I don't think it is a good idea to have the scalar crc by default.
> To me, the fastest available CRC has to be enabled by default.
> 
> I understand the technical reason why you did it like this however: the
> SIMD bitwidth may not be known at the time the
> RTE_INIT(rte_net_crc_init) function is called.
> 
> A simple approach to solve this issue would be to initialize the
> rte_net_crc_handler pointer to a handlers_default. The first time a crc
> is called, the rte_crc32_*_default_handler() function would check the
> configured SIMD bitwidth, and set the handler to the correct one, to
> avoid to do the test for next time.
> 
> This approach still does not solve the case where the SIMD bitwidth is
> modified during the life of the application. In this case, a callback
> would have to be registered to notify SIMD bitwidth changes... but I
> don't think it is worth to do it. Instead, it can be documented that
> rte_set_max_simd_bitwidth() has to be called early, before
> rte_eal_init().

Actually I also thought about callback approach.
It does complicate things a bit for sure, but on a positive side -
it allows to solve RTE_INIT() code-path selection problem
in a generic way, plus it means zero changes in the data-path. 
So probably worth to consider it.

> 
> 
> 
> > >>
> > >>
> > >> > >
> > >> > > Suggested-by: Jasvinder Singh <jasvinder.si...@intel.com>
> > >> > >
> > >> > > Signed-off-by: Ciara Power <ciara.po...@intel.com>
> > >> > >
> > >> > > ---
> > >> > > v3:
> > >> > >   - Moved choosing vector paths out of RTE_INIT.
> > >> > >   - Moved checking max_simd_bitwidth into the set_alg function.
> > >> > > ---
> > >> > >  lib/librte_net/rte_net_crc.c | 26 +++++++++++++++++---------
> > >> > > lib/librte_net/rte_net_crc.h |  3 ++-
> > >> > >  2 files changed, 19 insertions(+), 10 deletions(-)
> > >> > >
> > >> > > diff --git a/lib/librte_net/rte_net_crc.c
> > >> > > b/lib/librte_net/rte_net_crc.c index
> > >> > > 9fd4794a9d..241eb16399 100644
> > >> > > --- a/lib/librte_net/rte_net_crc.c
> > >> > > +++ b/lib/librte_net/rte_net_crc.c
> > >> >
> > >> > <snip>
> > >> >
> > >> > > @@ -145,18 +149,26 @@ rte_crc32_eth_handler(const uint8_t *data,
> > >> > > uint32_t data_len)  void  rte_net_crc_set_alg(enum rte_net_crc_alg
> > >> > > alg)  {
> > >> > > +    if (max_simd_bitwidth == 0)
> > >> > > +            max_simd_bitwidth = rte_get_max_simd_bitwidth();
> > >> > > +
> > >> > >      switch (alg) {
> > >> > >  #ifdef X86_64_SSE42_PCLMULQDQ
> > >> > >      case RTE_NET_CRC_SSE42:
> > >> > > -            handlers = handlers_sse42;
> > >> > > -            break;
> > >> > > +            if (max_simd_bitwidth >= RTE_MAX_128_SIMD) {
> > >> > > +                    handlers = handlers_sse42;
> > >> > > +                    return;
> > >> > > +            }
> > >> > > +            RTE_LOG(INFO, NET, "Max SIMD Bitwidth too low, using
> > >> > > scalar\n");
> > >> >
> > >> > [DC] Not sure if you're aware but there is another patchset which
> > >> > adds an
> > >> > AVX512 CRC implementation and run-time checking of cpuflags to
> > >> > select the CRC path to use:
> > >> > https://patchwork.dpdk.org/project/dpdk/list/?series=12596
> > >> >
> > >> > There will be a task to merge these 2 patchsets if both are merged.
> > >> > It looks fairly straightforward to me to merge these, but it would
> > >> > be good if you take a look too
> >
> > I have looked at that patchset, I agree, I think they will be 
> > straightforward to merge together.
> >
> > Thanks,
> > Ciara

Reply via email to