https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rdapp at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to JuzheZhong from comment #8)
> It's because the order of the operations we are doing:
>
> For code as follows:
>
> result += mask ? a[i] + x : 0;
>
> GCC:
> result_ssa_1 = PHI <result_ssa_2, 0>
> ...
> STMT 1. tmp = a[i] + x;
> STMT 2. tmp2 = tmp + result_ssa_1;
> STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;
>
> Here we can see both STMT 2 and STMT 3 are using 'result_ssa_1',
> we end up with 2 uses of the PHI result. Then, we failed to vectorize.
>
> Whereas LLVM:
>
> result_ssa_1 = PHI <result_ssa_2, 0>
> ...
> IR 1. tmp = a[i] + x;
> IR 2. tmp2 = mask ? tmp : 0;
> IR 3. result_ssa_2 = tmp2 + result_ssa_1.

For floating point these are not equivalent (adding zero isn't a no-op).

> LLVM only has 1 use.
>
> Is it reasonable to swap the order in match.pd ?

if-conversion could be taught to swap this (it's if-conversion creating
the IL for conditional reductions) when valid.  IIRC Robin Dapp also has
a patch to make if-conversion emit .COND_ADD instead which should make
it even better to vectorize.
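
To illustrate why the two orderings are not interchangeable for floating point, here is a minimal C sketch of both forms written as scalar loops. The function names `reduce_select_last` and `reduce_select_first` and the `init` parameter are hypothetical, introduced only for this illustration; they do not appear in GCC or LLVM. The sketch assumes IEEE 754 arithmetic in the default rounding mode and no `-ffast-math` or `-fno-signed-zeros`.

```c
#include <math.h>

/* Form from the quoted GCC IL: add first, then select between the
   updated sum and the old sum.  A masked-off iteration leaves the
   accumulator untouched.  (Hypothetical helper name.) */
double reduce_select_last(const double *a, const int *mask, int n,
                          double x, double init)
{
    double result = init;
    for (int i = 0; i < n; i++) {
        double tmp = a[i] + x;
        double tmp2 = tmp + result;
        result = mask[i] ? tmp2 : result;
    }
    return result;
}

/* Form from the quoted LLVM IR: select between the addend and 0 first,
   then add unconditionally.  A masked-off iteration still performs
   0.0 + result.  (Hypothetical helper name.) */
double reduce_select_first(const double *a, const int *mask, int n,
                           double x, double init)
{
    double result = init;
    for (int i = 0; i < n; i++) {
        double tmp = a[i] + x;
        double tmp2 = mask[i] ? tmp : 0.0;
        result = tmp2 + result;
    }
    return result;
}
```

With an all-false mask and `init` set to `-0.0`, the first function returns `-0.0` unchanged, while the second computes `0.0 + -0.0`, which is `+0.0` under round-to-nearest. The results compare equal with `==` but differ in `signbit`, which is enough to make the swap invalid unless signed zeros can be ignored.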