Hi,
This patch makes sure that allocno copies are not created for unordered
modes. The testcases in the PR highlighted a case where an allocno copy
was being created for:
(insn 121 120 123 11 (parallel [
            (set (reg:VNx2QI 217)
                (vec_duplicate:VNx2QI (subreg/s/v:QI (reg:SI 93 [ _2 ]) 0)))
            (clobber (scratch:VNx16BI))
        ]) 4750 {*vec_duplicatevnx2qi_reg}
     (expr_list:REG_DEAD (reg:SI 93 [ _2 ])
        (nil)))
As the compiler detected that the vec_duplicate<mode>_reg pattern
allowed the input and output operands to be of the same register class,
it tried to create an allocno copy for these two operands, stripping
subregs in the process. However, this meant that the copy was between
VNx2QI and SI, whose mode precisions are unordered: at compile time we
do not know which of the two modes is smaller, and knowing that is a
requirement when updating allocno copy costs.
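To illustrate what "unordered" means here, the following is a minimal
standalone sketch in C, not GCC code. It models a precision as c0 + c1 * x,
where x >= 0 is only known at run time. The names poly_prec, model_known_le,
model_ordered_p and check_pr98791_modes are hypothetical, and the VNx2QI
precision of 16 + 16x bits is an assumption about how SVE modes scale.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a two-coefficient polynomial size: the value
   is c0 + c1 * x, where x >= 0 is unknown at compile time (for SVE, the
   number of 128-bit blocks beyond the first).  */
struct poly_prec
{
  unsigned int c0;
  unsigned int c1;
};

/* True if A <= B holds for every x >= 0.  With non-negative coefficients
   this is the case exactly when both coefficients of A are <= those of B.  */
static bool
model_known_le (struct poly_prec a, struct poly_prec b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* True if one value is known to be <= the other for every x >= 0.  */
static bool
model_ordered_p (struct poly_prec a, struct poly_prec b)
{
  return model_known_le (a, b) || model_known_le (b, a);
}

/* Usage example for the modes discussed above.  */
static void
check_pr98791_modes (void)
{
  struct poly_prec si = { 32, 0 };       /* SI: always 32 bits.  */
  struct poly_prec di = { 64, 0 };       /* DI: always 64 bits.  */
  struct poly_prec vnx2qi = { 16, 16 };  /* Assumed: 16 + 16x bits.  */

  /* Two fixed-size modes can always be compared.  */
  assert (model_ordered_p (si, di));

  /* 16 + 16x is below 32 for x = 0 but above it for x = 2, so neither
     mode is known to be the smaller one.  */
  assert (!model_ordered_p (vnx2qi, si));
}
```

Under these assumptions the sketch reports SI and VNx2QI as unordered,
which is the situation the new ordered_p check in the patch guards against.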
Regression tested on aarch64-linux-gnu.
Is this OK for trunk (and, after a week, for a backport to gcc-10)?
Regards,
Andre
gcc/ChangeLog:
2021-02-19 Andre Vieira <andre.simoesdiasvie...@arm.com>
PR rtl-optimization/98791
* ira-conflicts.c (process_regs_for_copy): Don't create allocno
copies for unordered modes.
gcc/testsuite/ChangeLog:
2021-02-19 Andre Vieira <andre.simoesdiasvie...@arm.com>
PR rtl-optimization/98791
* gcc.target/aarch64/sve/pr98791.c: New test.
diff --git a/gcc/ira-conflicts.c b/gcc/ira-conflicts.c
index 2c2234734c3166872d94d94c5960045cb89ff2a8..d83cfc1c1a708ba04f5e01a395721540e31173f0 100644
--- a/gcc/ira-conflicts.c
+++ b/gcc/ira-conflicts.c
@@ -275,7 +275,10 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p,
ira_allocno_t a1 = ira_curr_regno_allocno_map[REGNO (reg1)];
ira_allocno_t a2 = ira_curr_regno_allocno_map[REGNO (reg2)];
- if (!allocnos_conflict_for_copy_p (a1, a2) && offset1 == offset2)
+ if (!allocnos_conflict_for_copy_p (a1, a2)
+ && offset1 == offset2
+ && ordered_p (GET_MODE_PRECISION (ALLOCNO_MODE (a1)),
+ GET_MODE_PRECISION (ALLOCNO_MODE (a2))))
{
cp = ira_add_allocno_copy (a1, a2, freq, constraint_p, insn,
ira_curr_loop_tree_node);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
new file mode 100644
index 0000000000000000000000000000000000000000..ee0c7b51602cacd45f9e33acecb1eaa9f9edebf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
@@ -0,0 +1,12 @@
+/* PR rtl-optimization/98791 */
+/* { dg-do compile } */
+/* { dg-options "-O -ftree-vectorize --param=aarch64-autovec-preference=3" } */
+extern char a[], b[];
+short c, d;
+long *e;
+void f() {
+ for (int g; g < c; g += 1) {
+ a[g] = d;
+ b[g] = e[g];
+ }
+}