https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64410
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Keywords|                              |missed-optimization
             Status|NEW                           |ASSIGNED
             Blocks|                              |53947
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #5)
> (In reply to Marc Glisse from comment #1)
> > There are a number of things that make it complicated.
> > 1) gcc doesn't like to vectorize when the number of iterations is not known
> > at compile time.
>
> Not an issue, we know it here (it's symbolic)
>
> > 2) gcc doesn't vectorize anything already involving complex or vector
> > operations.
>
> Indeed - here the issue is that we have C++ 'complex' aggregate
> load / store operations:
>
>   _67 = MEM[(const struct complex &)_75];
>   __r$_M_value = _67;
> ...
>   _51 = REALPART_EXPR <__r$_M_value>;
>   REALPART_EXPR <__r$_M_value> = _104;
> ...
>   IMAGPART_EXPR <__r$_M_value> = _107;
>   _108 = __r$_M_value;
>   MEM[(struct cx_double *)_72] = _108;
>
> which SRA for some reason didn't decompose as they are not aggregate
> (well, they are COMPLEX_TYPE).  They are not in SSA form either because
> they are partly written to.

And this forces it to be TREE_ADDRESSABLE.  Which means update-address-taken
might be a better candidate to fix this.  Note that it will still run into
the issue that the vectorizer does not like complex types (in loads), nor
does it like building complex registers via COMPLEX_EXPR.

After fixing update-address-taken we have

  __r$_M_value_70 = MEM[(const struct complex &)_78];
  _66 = MEM[(const double &)_77];
  _54 = REALPART_EXPR <__r$_M_value_70>;
  _105 = _54 + _66;
  _135 = IMAGPART_EXPR <__r$_M_value_70>;
  _106 = MEM[(const double &)_77 + 8];
  _107 = _106 + _135;
  __r$_M_value_180 = COMPLEX_EXPR <_105, _107>;
  MEM[(struct cx_double *)_76] = __r$_M_value_180;

which we ideally would have converted to piecewise loading / storing, but
the vectorizer may also be able to recover here with some twists.

> In this case it would have been profitable
> to SRA __r$_M_value.  Eventually this should have been complex lowerings
> job (but it doesn't try to decompose complex assignments).
>
> > 3) the ABI for complex uses 2 separate double instead of a vector of 2
> > double.
>
> I think that's unrelated.
>
> > I believe there are dups at least for 2).
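
For illustration only (this is not the testcase from the report): a minimal C++ sketch of what "piecewise loading / storing" means at the source level. The function names add_whole and add_piecewise are hypothetical, and cx_double is assumed here to be an alias for std::complex<double>; the dump only shows the type name. The first loop loads and stores each element as a whole complex value, which is the shape seen in the GIMPLE above. The second does the same arithmetic on the underlying doubles, relying on the array-oriented access that the C++ standard guarantees for std::complex, and is the kind of plain double loop the vectorizer is generally more comfortable with.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Assumed alias; the dump only shows the name "cx_double".
using cx_double = std::complex<double>;

// Whole-complex form: each iteration loads a complex value, adds the
// real and imaginary parts, and stores the complex value back as a unit.
void add_whole(std::vector<cx_double>& dst, const std::vector<cx_double>& src)
{
  assert(src.size() >= dst.size());
  for (std::size_t i = 0; i < dst.size(); ++i)
    dst[i] += src[i];
}

// Piecewise form: the same additions expressed on the underlying doubles.
// An array of std::complex<double> may be accessed as an array of double
// with the real part at index 2*i and the imaginary part at 2*i + 1.
void add_piecewise(std::vector<cx_double>& dst, const std::vector<cx_double>& src)
{
  assert(src.size() >= dst.size());
  double* d = reinterpret_cast<double*>(dst.data());
  const double* s = reinterpret_cast<const double*>(src.data());
  for (std::size_t i = 0; i < 2 * dst.size(); ++i)
    d[i] += s[i];
}
```

Both functions compute the same result; the second just makes the per-component loads and stores explicit instead of leaving that decomposition to the compiler.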