Hi Richard,
For the test-case mentioned in PR111702, compiling with -O2
-frounding-math -fstack-protector-all results in following ICE during
cse2 pass:
test.c: In function 'foo':
test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120
119 | }
| ^
0xb7ebb0 insert_regs
../../gcc/gcc/cse.cc:1120
0x1f95134 merge_equiv_classes
../../gcc/gcc/cse.cc:1764
0x1f9b9ab cse_insn
../../gcc/gcc/cse.cc:4793
0x1f9fe30 cse_extended_basic_block
../../gcc/gcc/cse.cc:6577
0x1f9fe30 cse_main
../../gcc/gcc/cse.cc:6722
0x1fa0984 rest_of_handle_cse2
../../gcc/gcc/cse.cc:7620
0x1fa0984 execute
../../gcc/gcc/cse.cc:7675
This happens only with interleave+zip1 vector initialization with
-frounding-math -fstack-protector-all, while it compiles OK without
-fstack-protector-all. Also, it compiles OK with fallback sequence
code-gen (with or without -fstack-protector-all). Unfortunately, I
haven't been able to reduce the test-case further :/
>From the test-case, it seems only the vector initializer for type J
uses interleave+zip1 approach, while rest of the vector initializers
use fallback sequence.
J is defined as:
typedef _Float16 __attribute__((__vector_size__ (16))) J;
and the initializer is:
(J) { 11654, 4801, 5535, 9743, 61680}
interleave+zip1 sequence for above initializer J:
mode = V8HF
vals: (parallel:V8HF [
(reg:HF 642)
(reg:HF 645)
(reg:HF 648)
(reg:HF 651)
(reg:HF 654)
(const_double:HF 0.0 [0x0.0p+0]) repeated x3
])
target: (reg:V8HF 641)
seq:
(insn 1058 0 1059 (set (reg:V4HF 657)
(const_vector:V4HF [
(const_double:HF 0.0 [0x0.0p+0]) repeated x4
])) "test.c":81:8 -1
(nil))
(insn 1059 1058 1060 (set (reg:V4HF 657)
(vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642))
(reg:V4HF 657)
(const_int 1 [0x1]))) "test.c":81:8 -1
(nil))
(insn 1060 1059 1061 (set (reg:V4HF 657)
(vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648))
(reg:V4HF 657)
(const_int 2 [0x2]))) "test.c":81:8 -1
(nil))
(insn 1061 1060 1062 (set (reg:V4HF 657)
(vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654))
(reg:V4HF 657)
(const_int 4 [0x4]))) "test.c":81:8 -1
(nil))
(insn 1062 1061 1063 (set (reg:V4HF 658)
(const_vector:V4HF [
(const_double:HF 0.0 [0x0.0p+0]) repeated x4
])) "test.c":81:8 -1
(nil))
(insn 1063 1062 1064 (set (reg:V4HF 658)
(vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645))
(reg:V4HF 658)
(const_int 1 [0x1]))) "test.c":81:8 -1
(nil))
(insn 1064 1063 1065 (set (reg:V4HF 658)
(vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651))
(reg:V4HF 658)
(const_int 2 [0x2]))) "test.c":81:8 -1
(nil))
(insn 1065 1064 0 (set (reg:V8HF 641)
(unspec:V8HF [
(subreg:V8HF (reg:V4HF 657) 0)
(subreg:V8HF (reg:V4HF 658) 0)
] UNSPEC_ZIP1)) "test.c":81:8 -1
(nil))
It seems to me that the above sequence correctly initializes the
vector into r641 ?
insns 1058-1061 construct r657 = { r642, r648, r654, 0 }
insns 1062-1064 construct r658 = { r645, r651, 0, 0 }
and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 }
For the above test, it seems that with interleave+zip1 approach and
-fstack-protector-all,
in cse pass, there are two separate equivalence classes created for
(const_int 1), that need
to be merged in cse_insn:
if (elt->first_same_value != src_eqv_elt->first_same_value)
{
/* The REG_EQUAL is indicating that two formerly distinct
classes are now equivalent. So merge them. */
merge_equiv_classes (elt, src_eqv_elt);
elt equivalence chain:
Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(const_int 1 [0x1])
src_eqv_elt equivalence chain:
Equivalence chain for (const_int 1 [0x1]):
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
(parallel [
(const_int 1 [0x1])
]))
(vec_select:QI (reg:V16QI 32 v0)
(parallel [
(const_int 1 [0x1])
]))
(vec_select:QI (reg:V16QI 33 v1)
(parallel [
(const_int 2 [0x2])
]))
(vec_select:QI (reg:V16QI 33 v1)
(parallel [
(const_int 1 [0x1])
]))
The issue is that merge_equiv_classes doesn't seem to deal correctly with
multiple occurences of same register in class2 (src_eqv_elt), which
has two occurrences of
(reg:QI 34 v2)
In merge_equiv_classes, on first iteration, it will remove (reg:QI 34)
from reg_equiv_table
by calling delete_equiv_reg(34), and in insert_regs it will create an
entry for (reg:QI 34) in qty_table with new quantity number, and
create new equivalence in reg_eqv_table.
When we again come across (reg:QI 34) in class2, it will
unconditionally remove the register from reg_eqv_table, thus making
REG_QTY(34) = -35, even tho (reg:QI 34) is now present in class1
chain.
Then in insert_regs, we have:
x: (reg:QI 34 v2)
classp:
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(const_int 1 [0x1])
And while iterating over elements in classp, we end up with regno ==
c_regno == 34.
However, as mentioned above, merge_equiv_classes has deleted entry for
(reg:QI 34) from reg_eqv_table, so it's no longer valid, and thus end
up hitting the following assert:
gcc_assert (REGNO_QTY_VALID_P (c_regno));
I am not sure tho why this is triggered only with interleave+zip1 approach with
-fstack-protector-all.
The attached (untested) patch is a workaround for the above issue --
In merge_equiv_classes,
while iterating over elements in class2, it simply checks if element
is a reg, and already inserted in class1 with equivalent mode, and
avoids deleting it from reg_eqv_table in that case.
This avoids hitting the assert, and following is the result of merging
above two classes:
Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
(parallel [
(const_int 1 [0x1])
]))
(vec_select:QI (reg:V16QI 33 v1)
(parallel [
(const_int 1 [0x1])
]))
(vec_select:QI (reg:V16QI 33 v1)
(parallel [
(const_int 2 [0x2])
]))
(vec_select:QI (reg:V16QI 32 v0)
(parallel [
(const_int 1 [0x1])
]))
Which seems to be OK (?), but am not sure if this patch is in the
right direction,
and is also not efficient.
Could you please suggest how to proceed ?
Thanks,
Prathamesh
diff --git a/gcc/cse.cc b/gcc/cse.cc
index f9603fdfd43..1e20be457c4 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -1747,7 +1747,16 @@ merge_equiv_classes (struct table_elt *class1, struct
table_elt *class2)
if (REG_P (exp))
{
need_rehash = REGNO_QTY_VALID_P (REGNO (exp));
- delete_reg_equiv (REGNO (exp));
+
+ /* If reg is already inserted into class1 and has a valid new
+ quantity, avoid deleting it from reg_eqv_table. */
+ table_elt *e;
+ for (e = class1->first_same_value; e; e = e->next_same_value)
+ if (REG_P (e->exp) && REGNO (e->exp) == REGNO (exp)
+ && e->mode == mode)
+ break;
+ if (e == NULL)
+ delete_reg_equiv (REGNO (exp));
}
if (REG_P (exp) && REGNO (exp) >= FIRST_PSEUDO_REGISTER)