memcmp for powerpc

Aaron Sawdey Wed, 17 Oct 2018 13:29:02 -0700

I've previously posted a patch to add vector/vsx inline expansion of
strcmp/strncmp for the power8/power9 processors. Here are some of the
other items I have in the pipeline that I hope to get into gcc9:


* vector/vsx support for inline expansion of memcmp to non-loop code.
  This improves performance of small memcmp.
* vector/vsx support for inline expansion of memcmp to loop code. This
  will close the performance gap for lengths of about 128-512 bytes
  by making the loop code closer to the performance of the library
  memcmp.
* generate inline expansion to a loop for strcmp/strncmp. This closes
  another performance gap: the strcmp/strncmp vector/vsx code
  currently generated is much faster than the library call, but we
  only generate comparison of 64 bytes to avoid exploding code size.
  Similar code in a loop would be compact and allow inline comparison
  of maybe the first 512 bytes before falling back to the library
  function.
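
To make the "inline loop, then library call" idea in the list above
concrete, here is a rough scalar C sketch of the shape such an expansion
could take for memcmp. This is an illustration only, not the code gcc
emits: the real expansion would use vector/vsx instructions, and the
names sketch_memcmp and INLINE_LIMIT are hypothetical. The 512-byte
limit is borrowed from the figures above purely as an example value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff: bytes compared inline before calling the library. */
#define INLINE_LIMIT 512

/* Sketch only: compare up to INLINE_LIMIT bytes in a compact loop,
   then hand any remainder to the library memcmp. */
static int sketch_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t inline_n = n < INLINE_LIMIT ? n : INLINE_LIMIT;
    size_t i = 0;

    /* Compare 8 bytes per iteration; memcpy keeps unaligned loads well defined. */
    for (; i + 8 <= inline_n; i += 8) {
        uint64_t wa, wb;
        memcpy(&wa, pa + i, 8);
        memcpy(&wb, pb + i, 8);
        if (wa != wb)
            break;  /* the first difference lies in this chunk */
    }

    /* Resolve a mismatching chunk, or the short tail, one byte at a time. */
    for (; i < inline_n; i++) {
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    }

    /* Everything within the inline limit matched; fall back to the library. */
    if (n > inline_n)
        return memcmp(pa + inline_n, pb + inline_n, n - inline_n);
    return 0;
}
```

The point of the loop form is that code size stays fixed no matter how
many bytes are compared inline, which is what would allow raising the
limit well past what straight-line expansion can afford.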

If anyone has any other input on the inline expansion work I've been
doing for the rs6000 target, please let me know.

Thanks!
    Aaron


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

[RFC][GCC][rs6000] Remaining work for inline expansion of strncmp/strcmp/memcmp for powerpc
