https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220
Bug ID: 99220
Summary: [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: tnfchris at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-*

The following testcase

    class a {
      float b;
      float c;

    public:
      a(float d, float e) : b(d), c(e) {}
      a operator+(a d) { return a(b + d.b, c + d.c); }
      a operator-(a d) { return a(b - d.b, c - d.c); }
      a operator*(a d) { return a(b * b - c * c, b * c + c * d.b); }
    };
    long f;
    a *g;
    class {
      a *h;
      long i;
      a *j;

    public:
      void k() {
        a l = h[0], m = g[i], n = l * g[1], o = l * j[8];
        g[i] = m + n;
        g[i + 1] = m - n;
        j[f] = o;
      }
    } p;
    main() { p.k(); }

crashes with

    aarch64-none-elf-g++ -w -march=armv8.3-a -O3 -S main.cpp

because two nodes end up with the same pointer. During the loop that analyzes all the instances during optimize_load_redistribution_1 we do

    if (value)
      {
        SLP_TREE_REF_COUNT (value)++;
        SLP_TREE_CHILDREN (root)[i] = value;
        vect_free_slp_tree (node);
      }

when doing a replacement. When this is done and the refcount for the node reaches 0, the node is removed, which allows the libc to return the pointer again in the next call to new, which it does.
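The pointer-reuse hazard described above can be reproduced outside the compiler. The following is a minimal sketch, not GCC code: `Node`, `Pool`, and `stale_hit_after_reuse` are hypothetical names introduced for illustration, and the pool's LIFO free list is an assumption that makes the address reuse deterministic (a real allocator may or may not hand the same address back).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node: just a refcount and a lane count.
struct Node {
  int refcnt = 1;
  unsigned lanes = 0;
};

// Toy allocator with a LIFO free list, so a freed slot is handed out again
// by the very next allocation (a deterministic model of allocator reuse).
class Pool {
  std::vector<Node> slots_;
  std::vector<Node *> free_;

public:
  explicit Pool(std::size_t n) : slots_(n) {
    for (std::size_t i = n; i-- > 0;)
      free_.push_back(&slots_[i]);
  }
  Node *alloc(unsigned lanes) {
    assert(!free_.empty());
    Node *p = free_.back();
    free_.pop_back();
    p->refcnt = 1;
    p->lanes = lanes;
    return p;
  }
  // Drop one reference; recycle the slot when the count reaches zero.
  void release(Node *p) {
    if (--p->refcnt == 0)
      free_.push_back(p);
  }
};

// Returns true when a fresh node is mistaken for an already visited one.
bool stale_hit_after_reuse() {
  Pool pool(4);
  std::unordered_set<const Node *> visited; // keyed by address only

  Node *first = pool.alloc(4); // e.g. a 4-lane permute node
  visited.insert(first);
  pool.release(first); // refcnt 1 -> 0, the slot is recycled

  Node *second = pool.alloc(2); // a different 2-lane node, same address
  return second == first && visited.count(second) == 1;
}
```

Calling `stale_hit_after_reuse()` returns `true` in this model: the address-keyed set reports the 2-lane node as already seen, even though only the 4-lane node was ever inserted.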
First instance

    note: node 0x5325f48 (max_nunits=1, refcnt=2)
    note: op: VEC_PERM_EXPR
    note:   { }
    note:   lane permutation { 0[0] 1[1] 0[2] 1[3] }
    note:   children 0x5325db0 0x5325200

Second instance

    note: node 0x5325f48 (max_nunits=1, refcnt=1)
    note: op: VEC_PERM_EXPR
    note:   { }
    note:   lane permutation { 0[0] 1[1] }
    note:   children 0x53255b8 0x5325530

This will end up with the illegal construction of

    note: node 0x53258e8 (max_nunits=2, refcnt=2)
    note: op template: slp_patt_57 = .COMPLEX_MUL (_16, _16);
    note:   stmt 0 _16 = _14 - _15;
    note:   stmt 1 _23 = _17 + _22;
    note:   children 0x53257d8 0x5325d28
    note: node 0x53257d8 (max_nunits=2, refcnt=3)
    note: op template: l$b_4 = MEM[(const struct a &)_3].b;
    note:   stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
    note:   stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
    note:   load permutation { 0 1 }
    note: node 0x5325d28 (max_nunits=2, refcnt=8)
    note: op template: l$b_4 = MEM[(const struct a &)_3].b;
    note:   stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
    note:   stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
    note:   stmt 2 l$b_4 = MEM[(const struct a &)_3].b;
    note:   stmt 3 l$c_5 = MEM[(const struct a &)_3].c;
    note:   load permutation { 0 1 0 1 }

To prevent this we need to add these temporary VEC_PERM_EXPR nodes to the bst_map cache and increase their refcnt by one more.
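The idea behind the proposed fix, letting the cache own a reference so a cached node cannot be freed and its address recycled while the walk is still running, can be sketched in isolation. This is an illustrative model only, not the actual patch: `Node`, `Pool`, and `cache_reference_prevents_reuse` are hypothetical names, and the LIFO pool is an assumed stand-in for the system allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node: just a refcount and a lane count.
struct Node {
  int refcnt = 1;
  unsigned lanes = 0;
};

// Toy allocator with a LIFO free list: the most recently freed slot is
// returned by the next allocation.
class Pool {
  std::vector<Node> slots_;
  std::vector<Node *> free_;

public:
  explicit Pool(std::size_t n) : slots_(n) {
    for (std::size_t i = n; i-- > 0;)
      free_.push_back(&slots_[i]);
  }
  Node *alloc(unsigned lanes) {
    assert(!free_.empty());
    Node *p = free_.back();
    free_.pop_back();
    p->refcnt = 1;
    p->lanes = lanes;
    return p;
  }
  // Drop one reference; recycle the slot when the count reaches zero.
  void release(Node *p) {
    if (--p->refcnt == 0)
      free_.push_back(p);
  }
};

// Returns true when the second node gets a fresh address and is not
// confused with the cached first node.
bool cache_reference_prevents_reuse() {
  Pool pool(4);
  std::unordered_set<Node *> cache; // address-keyed, but owns a reference

  Node *first = pool.alloc(4);
  cache.insert(first);
  ++first->refcnt;     // the cache holds one extra reference
  pool.release(first); // refcnt 2 -> 1, the slot stays live

  Node *second = pool.alloc(2); // must come from a different slot
  bool distinct = second != first && cache.count(second) == 0;

  // Once the walk is over, the cache gives its references back.
  for (Node *n : cache)
    pool.release(n);
  cache.clear();
  pool.release(second);
  return distinct;
}
```

In this model `cache_reference_prevents_reuse()` returns `true`: because the cached node stays alive until the cache is emptied, the 2-lane node is allocated at a different address and the lookup no longer produces a stale hit.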