https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88189
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess the most important difference is that we if-convert the latter early,
while the former is only if-converted during RTL if-conversion. So the best
chance to catch this would be in combine, but the combiner doesn't try to
combine the 4 needed instructions:
(insn 26 9 27 2 (set (reg:DF 87)
        (lt:DF (reg/v:DF 83 [ a ])
            (reg:DF 85))) "pr88189.c":6:16 656 {setcc_df_sse}
     (expr_list:REG_DEAD (reg:DF 85)
        (nil)))
(insn 27 26 28 2 (set (reg:DF 88)
        (and:DF (reg/v:DF 83 [ a ])
            (reg:DF 87))) "pr88189.c":6:16 1825 {*anddf3}
     (expr_list:REG_DEAD (reg/v:DF 83 [ a ])
        (nil)))
(insn 28 27 29 2 (set (reg:DF 89)
        (and:DF (not:DF (reg:DF 87))
            (reg/v:DF 84 [ b ]))) "pr88189.c":6:16 1820 {*andnotdf3}
     (expr_list:REG_DEAD (reg:DF 87)
        (expr_list:REG_DEAD (reg/v:DF 84 [ b ])
            (nil))))
(insn 29 28 18 2 (set (reg:DF 82 [ <retval> ])
        (ior:DF (reg:DF 89)
            (reg:DF 88))) "pr88189.c":6:16 1826 {*iordf3}
     (expr_list:REG_DEAD (reg:DF 89)
        (expr_list:REG_DEAD (reg:DF 88)
            (nil))))
it tries just 3, like:
Trying 26, 27 -> 29:
   26: r87:DF=r83:DF<[`*.LC0']
   27: r88:DF=r83:DF&r87:DF
      REG_DEAD r83:DF
   29: r82:DF=r89:DF|r88:DF
      REG_DEAD r89:DF
      REG_DEAD r88:DF
Can't combine i1 into i3
and
Trying 28, 27 -> 29:
   28: r89:DF=~r87:DF&r91:DF
      REG_DEAD r91:DF
      REG_DEAD r87:DF
   27: r88:DF=r83:DF&r87:DF
      REG_DEAD r83:DF
   29: r82:DF=r89:DF|r88:DF
      REG_DEAD r89:DF
      REG_DEAD r88:DF
Failed to match this instruction:
(set (reg:DF 82 [ <retval> ])
    (ior:DF (and:DF (not:DF (reg:DF 87))
            (reg:DF 91))
        (and:DF (reg/v:DF 83 [ a ])
            (reg:DF 87))))
The last one above is the most promising, but we'd need a guarantee that r87
is either -1 (all bits set) or 0 and nothing else (it is in this case, but not
in general). That is because blendvpd is not a generic (x & ~y) | (z & y): it
only looks at the sign bit of y, so the two agree only when y has the integral
value -1 or 0.
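
To illustrate why the mask guarantee matters, here is a rough scalar model in
C of the four insns and of one blendvpd lane. It is only a sketch: the helper
names (bits_of, double_of, cmplt_mask, bitwise_select, blendv_select,
select_seq, self_check) are made up for this example and are not GCC
internals, and the parameter c stands in for the constant loaded from *.LC0,
whose value the dump does not show.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern and back. */
static uint64_t bits_of(double d) { uint64_t u; memcpy(&u, &d, sizeof u); return u; }
static double double_of(uint64_t u) { double d; memcpy(&d, &u, sizeof d); return d; }

/* Models insn 26 (setcc_df_sse): all ones if a < c, otherwise all zeros. */
static uint64_t cmplt_mask(double a, double c)
{
    return a < c ? UINT64_MAX : 0;
}

/* Models insns 27-29: (a & mask) | (~mask & b), the generic bitwise select. */
static uint64_t bitwise_select(uint64_t mask, uint64_t a, uint64_t b)
{
    return (a & mask) | (~mask & b);
}

/* Scalar model of one blendvpd lane: only the sign bit of mask is consulted. */
static uint64_t blendv_select(uint64_t mask, uint64_t a, uint64_t b)
{
    return (mask >> 63) ? a : b;
}

/* The whole four-insn sequence; c is a hypothetical stand-in for *.LC0. */
static double select_seq(double a, double b, double c)
{
    uint64_t m = cmplt_mask(a, c);
    return double_of(bitwise_select(m, bits_of(a), bits_of(b)));
}

/* The two selects agree for masks of -1 and 0, and can differ otherwise. */
static void self_check(void)
{
    uint64_t a = bits_of(1.5), b = bits_of(-2.25);
    uint64_t sign_only = UINT64_C(0x8000000000000000);

    assert(bitwise_select(UINT64_MAX, a, b) == blendv_select(UINT64_MAX, a, b));
    assert(bitwise_select(0, a, b) == blendv_select(0, a, b));
    assert(bitwise_select(sign_only, a, b) != blendv_select(sign_only, a, b));
}
```

With a mask that has only the sign bit set, the bitwise form mixes bits of
both operands while the blend model returns the first operand unchanged, so
combine could only rewrite the ior/and/andnot pattern into a blend if it knew
the mask came from a comparison such as insn 26.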