https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

Bug ID: 116103
Summary: [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: testsuite-fail
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, pan2.li at intel dot com
Target Milestone: ---
Target: GCN

With recent commit r15-2241-g905973410957891fec8a3e42eeefa4618780e0ce "Internal-fn: Only allow modes describe types for internal fn[PR115961]", we've got a few regressions for '--target=amdgcn-amdhsa' (tested '-march=gfx908'). From a quick glance, I can't tell if this is worse or just different code generation. (Andrew?)

    PASS: gcc.dg/tree-ssa/loop-bound-2.c (test for excess errors)
    FAIL: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump ivopts "bounded by 254"
    PASS: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts "bounded by 255"
    [-PASS:-]{+FAIL:+} gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts "zero if "

Note that 'scan-tree-dump ivopts "bounded by 254"' already did FAIL before, but the FAIL of 'scan-tree-dump-not ivopts "zero if "' is new:

    --- G/loop-bound-2.c.188t.ivopts 2024-07-26 09:34:22.838958365 +0200
    +++ B/loop-bound-2.c.188t.ivopts 2024-07-26 09:47:10.822525365 +0200
    @@ -5,15 +5,22 @@
    ;; Loop 1
    ;; header 3, latch 6
    ;; depth 1, outer 0, finite_p
    -;; niter scev_not_known
    +;; niter (unsigned short) bnd.8_23 + 63 > 63 ? ((unsigned short) bnd.8_23 + 65535) / 64 : 0
    ;; upper_bound 3
    ;; likely_upper_bound 3
    ;; iterations by profile: 3.000000 (unreliable) entry count:105119324 (estimated locally, freq 0.8900)
    ;; nodes: 3 6
    Processing loop 1 at source-gcc/gcc/testsuite/gcc.dg/tree-ssa/loop-bound-2.c:14
    - single exit 3 -> 8, exit condition if (next_mask_38 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 })
    + single exit 3 -> 8, exit condition if (ivtmp_35 > 64)
    +Analyzing # of iterations of loop 1
    + exit condition 64 < [(unsigned short) bnd.8_23, + , 65472]
    + bounds on difference of bases: -64 ... 65471
    + result:
    + zero if (unsigned short) bnd.8_23 + 63 <= 63
    + # of iterations ((unsigned short) bnd.8_23 + 65535) / 64, bounded by 1023
    + number of iterations ((unsigned short) bnd.8_23 + 65535) / 64; zero if (unsigned short) bnd.8_23 + 63 <= 63
    [...]

And then, a number of regressions of 'scan-assembler-times \\tv_cmp_gt_i32\\tvcc, [...]' and 'scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, [...]':

    @@ -125843,7 +125901,7 @@ PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo,
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
    @@ -125854,7 +125912,7 @@ PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo,
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
    @@ -125864,7 +125922,7 @@ PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times umaxv64si3_exec 20
    @@ -125874,7 +125932,7 @@ PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times uminv64si3_exec 20
    @@ -126401,13 +126459,13 @@ PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivv64si3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __umodv64si3@rel32@lo 0
    PASS: gcc.target/gcn/smax_1.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/smax_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1_run.c execution test
    PASS: gcc.target/gcn/smin_1.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/smin_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1_run.c execution test
    @@ -126433,13 +126491,13 @@ PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
    PASS: gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
    PASS: gcc.target/gcn/umax_1.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1_run.c execution test
    PASS: gcc.target/gcn/umin_1.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1_run.c execution test
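
For reference, the new 'niter' expression from the ivopts dump can be evaluated by hand. The following is only a sketch, not anything GCC runs: it assumes that all arithmetic in the expression wraps modulo 2**16, as the '(unsigned short)' cast suggests, and the names 'MASK', 'niter' and 'max_niter' are made up for illustration.

```python
# Sketch: evaluate the dumped expression
#   (unsigned short) bnd + 63 > 63 ? ((unsigned short) bnd + 65535) / 64 : 0
# Assumption: every operation wraps modulo 2**16 (16-bit unsigned short).
MASK = 0xFFFF


def niter(bnd: int) -> int:
    """Return the value of the dumped niter expression for one value of bnd."""
    b = bnd & MASK
    # Guard from the dump: the count is zero if (b + 63) <= 63 after wrapping.
    if ((b + 63) & MASK) > 63:
        # Adding 65535 modulo 2**16 is the same as subtracting 1.
        return ((b + 65535) & MASK) // 64
    return 0


# Largest value the guarded expression takes over all 16-bit inputs.
max_niter = max(niter(b) for b in range(1 << 16))
print(max_niter)      # 1022
print(0xFFFF // 64)   # 1023
```

Under that assumption the guarded expression never exceeds 1022, while 65535 / 64 = 1023 is the largest quotient any 16-bit dividend can give, which is consistent with "bounded by 1023" in the dump. How GCC actually arrives at that bound is not visible in the excerpt, and either number is well above the "bounded by 254" that the test scans for.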