> -----Original Message-----
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Friday, October 13, 2017 15:36
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Li, Xiaoyun <xiaoyun...@intel.com>
> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richard...@intel.com>; Lu, Wenzhuo <wenzhuo...@intel.com>; Zhang, Helin <helin.zh...@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over memcpy
>
> 13/10/2017 09:31, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > 13/10/2017 03:06, Li, Xiaoyun:
> > > > Hi
> > > > Sorry for the late reply. I took AL last 3 days.
> > > >
> > > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > > > 05/10/2017 14:33, Xiaoyun Li:
> > > > > > +/**
> > > > > > + * Macro for copying unaligned block from one location to another with constant load offset,
> > > > > > + * 47 bytes leftover maximum,
> > > > > > + * locations should not overlap.
> > > > > > + * Requirements:
> > > > > > + * - Store is aligned
> > > > > > + * - Load offset is <offset>, which must be immediate value within [1, 15]
> > > > > > + * - For <src>, make sure <offset> bit backwards & <16 - offset> bit forwards are available for loading
> > > > > > + * - <dst>, <src>, <len> must be variables
> > > > > > + * - __m128i <xmm0> ~ <xmm8> must be pre-defined
> > > > > > + */
> > > > > > +#define MOVEUNALIGNED_LEFT47_IMM(dst, src, len,
> > > > >
> > > > > Naive question:
> > > > > Is there a real benefit of using a macro compared to a static
> > > > > inline function optimized by a modern compiler?
> > > > >
> > > > The macro is in the existing DPDK code. I didn't touch it. I just
> > > > changed the file name and the function name to rte_memcpy_internal.
> > > > So I am not sure whether there is a real benefit.
> > > > In my opinion, it is the same as a static inline function.
> > > >
> > > > Do I need to change them to inline functions?
> > >
> > > In this patch, it appears as a new macro.
> >
> > Ah no, it has definitely been there before.
> > All we did here - git mv rte_memcpy.h rte_memcpy_internal.h and then
> > in rte_memcpy_internal.h renamed rte_memcpy() to rte_memcpy_internal().
> >
> > > If you can, inline function is cleaner for the new one.
> >
> > I don't think it will be straightforward - one of the parameters is a
> > constant value.
> > My preference would be to keep the original rte_memcpy() code intact as
> > much as we can here (except probably cosmetic changes - indentation,
> > line length fixing etc.).
> > After all, that patch is for adding architecture function selection at
> > runtime only.
> > If we would like to improve our rte_memcpy() any further - NP with that,
> > but let it be a separate patch.
>
> OK
>
Then I will only fix indentation and line length, and keep the original macro.
> I am waiting for this patch to close RC1 today.
I will do it ASAP.

Best Regards
Xiaoyun Li
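
For readers following the thread, the "architecture function selection at runtime" mentioned above can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the names copy_fn, copy_generic, copy_avx2, select_copy and dispatch_memcpy are hypothetical, both variants simply defer to memcpy(), and the CPU check relies on the GCC/Clang x86 built-ins __builtin_cpu_init() and __builtin_cpu_supports().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all copy variants. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Placeholder variants. A real build would compile each one for a
 * different instruction set; here both simply defer to memcpy().
 */
static void *copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Choose a variant from the CPU features reported at run time. */
static copy_fn select_copy(void)
{
	int use_avx2 = 0;

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	/* GCC/Clang built-ins, available on x86 targets only. */
	__builtin_cpu_init();
	use_avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
	return use_avx2 ? copy_avx2 : copy_generic;
}

/* Resolved once, on the first call. */
static copy_fn copy_impl;

static void *dispatch_memcpy(void *dst, const void *src, size_t n)
{
	if (copy_impl == NULL)
		copy_impl = select_copy();
	return copy_impl(dst, src, n);
}
```

A caller would use dispatch_memcpy(dst, src, len) exactly like memcpy(). The selection cost is paid once, and every later call costs one extra indirect call.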