https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64410

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|NEW                         |ASSIGNED
             Blocks|                            |53947
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #5)
> (In reply to Marc Glisse from comment #1)
> > There are a number of things that make it complicated.
> > 1) gcc doesn't like to vectorize when the number of iterations is not known
> > at compile time.
> 
> Not an issue, we know it here (it's symbolic)
> 
> > 2) gcc doesn't vectorize anything already involving complex or vector
> > operations.
> 
> Indeed - here the issue is that we have C++ 'complex' aggregate
> load / store operations:
> 
>   _67 = MEM[(const struct complex &)_75];
>   __r$_M_value = _67;
> ...
>   _51 = REALPART_EXPR <__r$_M_value>;
>   REALPART_EXPR <__r$_M_value> = _104;
> ...
>   IMAGPART_EXPR <__r$_M_value> = _107;
>   _108 = __r$_M_value;
>   MEM[(struct cx_double *)_72] = _108;
> 
> which SRA for some reason didn't decompose as they are not aggregate
> (well, they are COMPLEX_TYPE).  They are not in SSA form either because
> they are partly written to.

And this forces it to be TREE_ADDRESSABLE, which means update-address-taken
might be a better candidate to fix this.

Note that it will still run into the issue that the vectorizer does not
like complex types (in loads), nor does it like building complex
registers via COMPLEX_EXPR.  After fixing update-address-taken we have:

  __r$_M_value_70 = MEM[(const struct complex &)_78];
  _66 = MEM[(const double &)_77];
  _54 = REALPART_EXPR <__r$_M_value_70>;
  _105 = _54 + _66;
  _135 = IMAGPART_EXPR <__r$_M_value_70>;
  _106 = MEM[(const double &)_77 + 8];
  _107 = _106 + _135;
  __r$_M_value_180 = COMPLEX_EXPR <_105, _107>;
  MEM[(struct cx_double *)_76] = __r$_M_value_180;

which we ideally would have converted to piecewise loading / storing,
but the vectorizer may also be able to recover here with some twists.
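
To illustrate what "piecewise loading / storing" means at the source level,
here is a minimal sketch.  It is not the testcase from this PR; the alias
cplx and the functions add_whole and add_piecewise are hypothetical names
made up for the illustration.  add_whole loads and stores whole complex
objects, roughly the shape of the GIMPLE above, while add_piecewise treats
the same storage as a flat array of doubles, which is the form one would
expect to be easier on the vectorizer.  Whether a given GCC version
vectorizes either loop is not claimed here.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical alias for this sketch only (not the cx_double struct
// that appears in the dump above).
using cplx = std::complex<double>;

// Whole-object form: every iteration loads a complete complex value,
// adds, and stores a complete complex value back.
void add_whole(std::vector<cplx>& dst, const std::vector<cplx>& src) {
    const std::size_t n = dst.size() < src.size() ? dst.size() : src.size();
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += src[i];
}

// Piecewise form: real and imaginary parts are loaded and stored as
// individual doubles.  The C++ standard guarantees that an array of
// std::complex<T> may be accessed as an array of T with the layout
// re0, im0, re1, im1, ...
void add_piecewise(std::vector<cplx>& dst, const std::vector<cplx>& src) {
    const std::size_t n = dst.size() < src.size() ? dst.size() : src.size();
    double* d = reinterpret_cast<double*>(dst.data());
    const double* s = reinterpret_cast<const double*>(src.data());
    for (std::size_t i = 0; i < 2 * n; ++i)
        d[i] += s[i];
}
```

Both functions compute the same result; only the granularity of the memory
accesses differs.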

> In this case it would have been profitable
> to SRA __r$_M_value.  Eventually this should have been complex lowering's
> job (but it doesn't try to decompose complex assignments).
> 
> > 3) the ABI for complex uses 2 separate double instead of a vector of 2
> > double.
> 
> I think that's unrelated.
> 
> > I believe there are dups at least for 2).
