Hi,

On Wed, 15 May 2019, Aaron Sawdey wrote:

> Yes this would be a nice thing to get to, a single move/copy underlying 
> builtin, to which we communicate what the compiler's analysis tells us 
> about whether the operands overlap and by how much.
> 
> Next question would be how do we move from the existing movmem pattern 
> (which Michael Matz tells us should be renamed cpymem anyway) to this 
> new thing. Are you proposing that we still have both movmem and cpymem 
> optab entries underneath to call the patterns but introduce this new 
> memmove_with_hints() to be used by things called by 
> expand_builtin_memmove() and expand_builtin_memcpy()?

I'd say so.  There are multiple levels at play:
a) exposure to user: probably a new __builtin_memmove, or a new combined 
   builtin with a hint param to differentiate (but we can't get rid of 
   __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
   route in the middleend)
b) getting it through the gimple pipeline, probably just a new builtin 
   code, trivial
c) expanding the new builtin, with the help of next items
d) RTL block moves: they are defined as non-overlapping and I don't think 
   we should change this (essentially they're the reflection of struct 
   copies in C)
e) how any of the above (builtins and RTL block moves) are implemented: 
   currently non-overlapping only, using movmem pattern when possible; 
   ultimately all sitting in the emit_block_move_hints() routine.
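
To make the distinction in (d) concrete, here is a small user-level sketch 
in plain C.  The names (struct pair, copy_pair, shift_right_by_one, 
check_overlap_examples) are made up for illustration and have nothing to 
do with GCC internals.  A struct assignment never involves partially 
overlapping operands, while shifting a buffer in place does, so it needs 
memmove semantics:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example type; any struct would do.  */
struct pair { int a, b; };

/* Struct assignment: the two objects are either distinct or identical,
   never partially overlapping, so a non-overlapping block copy is enough.  */
static void
copy_pair (struct pair *dst, const struct pair *src)
{
  *dst = *src;
}

/* Shift the first LEN bytes of BUF right by one position.  Source and
   destination overlap here, so memmove (not memcpy) is required.  */
static void
shift_right_by_one (char *buf, size_t len)
{
  if (len > 1)
    memmove (buf + 1, buf, len - 1);
}

/* Small self-check exercising both helpers.  */
static void
check_overlap_examples (void)
{
  struct pair x = { 1, 2 }, y = { 0, 0 };
  char s[] = "abcd";

  copy_pair (&y, &x);
  assert (y.a == 1 && y.b == 2);

  shift_right_by_one (s, 4);
  assert (memcmp (s, "aabc", 4) == 0);
}
```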

So, I'd add a new method to emit_block_move_hints indicating possible 
overlap, disabling the use of move_by_pieces.  Then in 
emit_block_move_via_movmem (also getting an indication of overlap), do the 
equivalent of:

  finished = 0;
  if (overlap_possible) {
    if (optab[movmem])
      finished = emit(movmem);
  } else {
    if (optab[cpymem])
      finished = emit(cpymem);
    if (!finished && optab[movmem])  // can use movmem also for overlap
      finished = emit(movmem);
  }

The overlap_possible method would only ever be used from the builtin 
expansion, and never from the RTL block move expand.  Additionally a 
target may optionally only define the movmem pattern if it's just as good 
as the cpymem pattern (e.g. because it only handles fixed small sizes and 
uses a load-all then store-all sequence).
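
As a rough standalone model of that selection logic (not actual GCC 
code), something like the following could be compiled and tested on its 
own.  Everything here is hypothetical: enum pattern, struct target, 
choose_pattern and check_examples are illustration-only names, and each 
flag folds "the pattern exists" and "its expansion succeeded" into one 
boolean, which is a simplification of the optab lookup plus emit step:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two patterns; not GCC's optab API.  */
enum pattern { PAT_NONE, PAT_CPYMEM, PAT_MOVMEM };

/* Simplified target description: each flag means the pattern exists
   and its expansion would succeed for the block at hand.  */
struct target
{
  bool cpymem_ok;   /* non-overlapping copy pattern usable */
  bool movmem_ok;   /* overlap-safe move pattern usable */
};

/* Return the pattern that would end up being used, or PAT_NONE if the
   caller has to fall back to something else (e.g. a library call).  */
static enum pattern
choose_pattern (const struct target *t, bool overlap_possible)
{
  /* With possible overlap only the overlap-safe pattern is valid.  */
  if (overlap_possible)
    return t->movmem_ok ? PAT_MOVMEM : PAT_NONE;

  /* Without overlap prefer cpymem, but movmem is also correct.  */
  if (t->cpymem_ok)
    return PAT_CPYMEM;
  return t->movmem_ok ? PAT_MOVMEM : PAT_NONE;
}

/* A target defining only movmem serves both cases; a target defining
   only cpymem cannot serve the possibly-overlapping case.  */
static void
check_examples (void)
{
  const struct target movmem_only = { false, true };
  const struct target cpymem_only = { true, false };

  assert (choose_pattern (&movmem_only, false) == PAT_MOVMEM);
  assert (choose_pattern (&movmem_only, true) == PAT_MOVMEM);
  assert (choose_pattern (&cpymem_only, false) == PAT_CPYMEM);
  assert (choose_pattern (&cpymem_only, true) == PAT_NONE);
}
```

The movmem_only case corresponds to a target that defines just the one 
pattern because it is as good as a separate cpymem would be.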


Ciao,
Michael.
