On Thu, Apr 23, 2015 at 2:23 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Thursday, April 23, 2015 9:12 AM
> > To: Wodkowski, PawelX
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] Implement memcmp using AVX/SSE instructio
> >
> > On Thu, Apr 23, 2015 at 09:24:52AM +0200, Pawel Wodkowski wrote:
> > > On 2015-04-22 17:33, Ravi Kerur wrote:
> > > >+/**
> > > >+ * Compare bytes between two locations. The locations must not overlap.
> > > >+ *
> > > >+ * @note This is implemented as a macro, so it's address should not be taken
> > > >+ * and care is needed as parameter expressions may be evaluated multiple times.
> > > >+ *
> > > >+ * @param src_1
> > > >+ *   Pointer to the first source of the data.
> > > >+ * @param src_2
> > > >+ *   Pointer to the second source of the data.
> > > >+ * @param n
> > > >+ *   Number of bytes to compare.
> > > >+ * @return
> > > >+ *   true if equal otherwise false.
> > > >+ */
> > > >+static inline bool
> > > >+rte_memcmp(const void *src_1, const void *src,
> > > >+ size_t n) __attribute__((always_inline));
> > > You are exposing this as public API, so I think you should follow
> > > description bellow or not call this _memcmp_
> > >
> > > int memcmp(const void *s1, const void *s2, size_t n);
> > >
> > > The memcmp() function returns an integer less than, equal to, or greater than
> > > zero if the first n bytes of s1 is found, respectively, to be less than, to
> > > match, or be greater than the first n bytes of s2.
> > >
> >
> > +1 to this point.
> >
> > Also, if I read your quoted performance numbers in your earlier mail correctly,
> > we are only looking at a 1-4% performance increase. Is the additional code to
> > maintain worth the benefit?
>
> Yep, same thought here, is it really worth it?
> Konstantin
>
> >
> > /Bruce
> >
> > > --
> > > Pawel

I don't think I have exploited everything x86 has to offer to improve performance yet, and I am looking for input.
Until we have exhausted all avenues, I don't want to drop it. One thing I have noticed is that larger key sizes give better performance numbers. I plan to re-run the perf tests with 64- and 128-byte keys and will report back. If there are any other avenues to try, please let me know and I will give them a shot.

Thanks,
Ravi
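
For reference, the API point raised in the quoted review (an equality-only result versus the ordered result that memcmp() promises) can be sketched roughly as below. This is only an illustration under my own assumptions, not the code from the patch: the names example_mem_equal and example_memcmp are hypothetical, the SSE2 path is used only when the compiler defines __SSE2__, and no performance claim is implied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifdef __SSE2__
#include <emmintrin.h>
#endif

/*
 * Hypothetical name: an equality-only compare. Because it returns bool,
 * it deliberately does not carry "memcmp" in its name.
 */
static inline bool
example_mem_equal(const void *a, const void *b, size_t n)
{
	const unsigned char *p = a;
	const unsigned char *q = b;

#ifdef __SSE2__
	/* Compare 16 bytes per step; a mask of 0xFFFF means all 16 bytes match. */
	while (n >= 16) {
		__m128i x = _mm_loadu_si128((const __m128i *)p);
		__m128i y = _mm_loadu_si128((const __m128i *)q);

		if (_mm_movemask_epi8(_mm_cmpeq_epi8(x, y)) != 0xFFFF)
			return false;
		p += 16;
		q += 16;
		n -= 16;
	}
#endif
	/* Remaining tail (or the whole buffer when SSE2 is unavailable). */
	return memcmp(p, q, n) == 0;
}

/*
 * Hypothetical name: follows the memcmp() contract by returning a value
 * less than, equal to, or greater than zero, decided by the first
 * differing byte compared as unsigned char.
 */
static inline int
example_memcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *p = a;
	const unsigned char *q = b;
	size_t i;

	for (i = 0; i < n; i++) {
		if (p[i] != q[i])
			return (p[i] < q[i]) ? -1 : 1;
	}
	return 0;
}
```

A caller that only needs a match test, such as a hash key lookup, would use example_mem_equal(key_a, key_b, key_len), while anything that sorts or orders buffers needs the signed result from example_memcmp().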