https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115519

--- Comment #1 from Stefan Schulze Frielinghaus <stefansf at gcc dot gnu.org> ---
For example, for function vesrlf_ge from vcond-shift.c we do not end up with

vl      %v2,0(%r2),3
vl      %v0,16(%r2),3
lgr     %r1,%r2
vesrlf  %v4,%v2,31
vesrlf  %v6,%v0,31
vst     %v4,0(%r1),3
vst     %v6,16(%r1),3
br      %r14

anymore, but instead with

vl      %v0,0(%r2),3
vl      %v4,16(%r2),3
vgmf    %v6,31,31
vzero   %v2
vesraf  %v1,%v0,31
vesraf  %v3,%v4,31
vsel    %v5,%v6,%v2,%v1
vsel    %v7,%v6,%v2,%v3
lgr     %r1,%r2
vst     %v5,0(%r1),3
vst     %v7,16(%r1),3
br      %r14

Previously, during vcond expansion, we optimized x < 0 ? 1 : 0 into a logical
right shift x >> 31, which we no longer do.  Doing it late in combine fails,
too, since combine never attempts a combination of insns 7, 8, 9, and 10:

(insn 7 6 8 2 (set (reg:V4SI 69 [ mask__5.8_4 ])
        (ashiftrt:V4SI (reg:V4SI 68 [ MEM <vector(4) int> [(int *)xx_10] ])
            (const_int 31 [0x1f]))) "vcond-shift.c":155:28 905 {*ashrv4si3}
     (expr_list:REG_DEAD (reg:V4SI 68 [ MEM <vector(4) int> [(int *)xx_10] ])
        (nil)))
(insn 8 7 9 2 (set (reg:V4SI 70)
        (const_vector:V4SI [
                (const_int 1 [0x1]) repeated x4
            ])) 410 {movv4si}
     (nil))
(insn 9 8 10 2 (set (reg:V4SI 71)
        (const_vector:V4SI [
                (const_int 0 [0]) repeated x4
            ])) 410 {movv4si}
     (nil))
(insn 10 9 11 2 (set (reg:V4SI 62 [ vect_patt_18.9 ])
        (if_then_else:V4SI (eq (reg:V4SI 69 [ mask__5.8_4 ])
                (const_vector:V4SI [
                        (const_int 0 [0]) repeated x4
                    ]))
            (reg:V4SI 71)
            (reg:V4SI 70))) 1265 {*vec_sel0v4si}
     (expr_list:REG_DEAD (reg:V4SI 69 [ mask__5.8_4 ])
        (expr_list:REG_EQUAL (if_then_else:V4SI (eq (reg:V4SI 69 [ mask__5.8_4 ])
                    (const_vector:V4SI [
                            (const_int 0 [0]) repeated x4
                        ]))
                (const_vector:V4SI [
                        (const_int 0 [0]) repeated x4
                    ])
                (const_vector:V4SI [
                        (const_int 1 [0x1]) repeated x4
                    ]))
            (nil))))

So maybe this is something for match.pd?
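
For reference, here is a minimal scalar sketch of the per-element identity
involved.  The function names are hypothetical and not taken from
vcond-shift.c; the sketch only illustrates why the mask-plus-select sequence
could in principle be replaced by a single logical shift.  It is not a
proposed match.pd pattern.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors the currently generated sequence.
   Build an all-ones/all-zeros mask from the sign (what vesraf by 31
   yields per element), then select 0 or 1 depending on mask == 0.  */
int32_t mask_select_form(int32_t x)
{
    int32_t mask = (x < 0) ? -1 : 0;
    return (mask == 0) ? 0 : 1;
}

/* Hypothetical helper: the form we used to get.  A logical right
   shift by 31 (what vesrlf does per element) leaves only the sign
   bit, i.e. 1 for negative inputs and 0 otherwise.  */
int32_t shift_form(int32_t x)
{
    return (int32_t)((uint32_t)x >> 31);
}

/* Returns 1 if both forms agree for x, 0 otherwise.  */
int forms_agree(int32_t x)
{
    return mask_select_form(x) == shift_form(x);
}
```

Calling forms_agree on boundary values such as INT32_MIN, -1, 0, and INT32_MAX
is a quick sanity check that the two forms compute the same result.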
