Hi,

  I've been analyzing a failing regtest (gcc.dg/strlenopt-8.c) for the avr
  target. I found that the (dump) failure is because there are 4
  instances of memcpy, while the testcase expects only 2 for a
  non-strict align target like the avr.

  Comparing that with a dump generated by x86_64-pc-linux, I found that
  the extra memcpy calls come from the forwprop pass, when it replaces
  strcat with strlen and memcpy. For x86_64, the memcpy generated gets
  folded into a load/store in gimple_fold_builtin_memory_op. That
  doesn't happen for the avr because len (2) happens to be bigger than
  MOVE_MAX (1).

  The avr can only move 1 byte efficiently from reg <-> memory, but it's
  more efficient to load and store 2 bytes than to call memcpy, so
  MOVE_MAX_PIECES is set to 2.

  Given that gimple_fold_builtin_memory_op gets to choose between
  leaving the memcpy call as is, or breaking it down to a by-pieces
  move, shouldn't it use MOVE_MAX_PIECES instead of
  MOVE_MAX?

  That is what the patch below does, and it makes the test
  pass. Does this sound right?
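
  To make the effect of the size check concrete, here is a minimal
  sketch of the comparison for the avr values quoted above (MOVE_MAX 1,
  MOVE_MAX_PIECES 2). The enum constants and fits_fold_limit are
  hypothetical names used only for illustration; they are not GCC
  code, and the real check in gimple_fold_builtin_memory_op has
  further conditions that this sketch ignores.

```c
#include <stdbool.h>

/* avr values as described in this mail.  Hypothetical names for
   illustration only, not the real GCC target macros.  */
enum { AVR_MOVE_MAX = 1, AVR_MOVE_MAX_PIECES = 2 };

/* Simplified stand-in for the length test: return true if a copy of
   LEN bytes is small enough to be folded into a load/store pair
   instead of being left as a memcpy call.  */
bool
fits_fold_limit (unsigned long len, unsigned long limit)
{
  return len <= limit;
}
```

  With len = 2, fits_fold_limit (2, AVR_MOVE_MAX) is false, so the
  memcpy call stays, while fits_fold_limit (2, AVR_MOVE_MAX_PIECES) is
  true, so the copy would be folded.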

Regards
Senthil

Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c   (revision 242741)
+++ gcc/gimple-fold.c   (working copy)
@@ -703,7 +703,7 @@
       src_align = get_pointer_alignment (src);
       dest_align = get_pointer_alignment (dest);
       if (tree_fits_uhwi_p (len)
-         && compare_tree_int (len, MOVE_MAX) <= 0
+         && compare_tree_int (len, MOVE_MAX_PIECES) <= 0
          /* ???  Don't transform copies from strings with known length this
             confuses the tree-ssa-strlen.c.  This doesn't handle
             the case in gcc.dg/strlenopt-8.c which is XFAILed for that
