Hi! The following testcase is miscompiled, because we have: (set (reg:SI ...) (subreg:SI (reg:QI (...)) 0)) instruction and the fwprop attempts to propagate (const_int -1) into the reg:QI use in there, but gives up because costs don't say it is beneficial and adds instead REG_EQUAL (const_int -1) note on the insn. That is wrong though, it is fine to optimize this insn to (set (reg:SI ...) (const_int -1)) because the higher bits in the paradoxical subreg are undefined, but as they are undefined, if the subreg is kept, those bits can be anything. If we say the subreg is equal to (const_int -1), it means e.g. CSE2 can replace other places that need SImode -1 with the SET_DEST of this insn, but then it really depends on what bits actually end up in the register. If we are unlucky, and it is e.g. spilled during LRA and reloaded using QImode, the upper bits can be anything.
Not sure if this patch catches everything though, perhaps there could be e.g. (set (reg:SI ...) (plus:SI ((subreg:SI (reg:QI ...) 0) (const_int ...))) and we'd still assign REG_EQUAL note. So maybe instead we should walk the *loc expression and look for paradoxical subregs, and for each of them, if we find the DF_REF_REG (use) mentioned in their operand, clear set_reg_equal. Though of course, if DF_REF_REG (use) itself is a paradoxical subreg, we could clear set_reg_equal without any walking. 2016-04-07 Jakub Jelinek <ja...@redhat.com> PR rtl-optimization/70574 * fwprop.c (forward_propagate_and_simplify): Don't add REG_EQUAL note if DF_REF_REG (use) is a paradoxical subreg. * gcc.target/i386/avx2-pr70574.c: New test. --- gcc/fwprop.c.jj 2016-01-04 14:55:53.000000000 +0100 +++ gcc/fwprop.c 2016-04-07 18:01:42.953844357 +0200 @@ -1213,7 +1213,7 @@ forward_propagate_and_simplify (df_ref u rtx_insn *use_insn = DF_REF_INSN (use); rtx use_set = single_set (use_insn); rtx src, reg, new_rtx, *loc; - bool set_reg_equal; + bool set_reg_equal = true; machine_mode mode; int asm_use = -1; @@ -1240,7 +1240,15 @@ forward_propagate_and_simplify (df_ref u /* Check if the use has a subreg, but the def had the whole reg. Unlike the previous case, the optimization is possible and often useful indeed. */ else if (GET_CODE (reg) == SUBREG && REG_P (SET_DEST (def_set))) - reg = SUBREG_REG (reg); + { + /* If the use is a paradoxical subreg, make sure we don't add a + REG_EQUAL note for it, because it is not equivalent, it is one + possible value for it, but we can't rely on it holding that value. + See PR70574. */ + if (paradoxical_subreg_p (reg)) + set_reg_equal = false; + reg = SUBREG_REG (reg); + } /* Make sure that we can treat REG as having the same mode as the source of DEF_SET. */ @@ -1301,13 +1309,13 @@ forward_propagate_and_simplify (df_ref u otherwise. We also don't want to install a note if we are merely propagating a pseudo since verifying that this pseudo isn't dead is a pain; moreover such a note won't help anything. */ - set_reg_equal = (note == NULL_RTX - && REG_P (SET_DEST (use_set)) - && !REG_P (src) - && !(GET_CODE (src) == SUBREG - && REG_P (SUBREG_REG (src))) - && !reg_mentioned_p (SET_DEST (use_set), - SET_SRC (use_set))); + set_reg_equal &= (note == NULL_RTX + && REG_P (SET_DEST (use_set)) + && !REG_P (src) + && !(GET_CODE (src) == SUBREG + && REG_P (SUBREG_REG (src))) + && !reg_mentioned_p (SET_DEST (use_set), + SET_SRC (use_set))); } if (GET_MODE (*loc) == VOIDmode) --- gcc/testsuite/gcc.target/i386/avx2-pr70574.c.jj 2016-04-07 18:09:25.788519218 +0200 +++ gcc/testsuite/gcc.target/i386/avx2-pr70574.c 2016-04-07 18:09:21.825573327 +0200 @@ -0,0 +1,26 @@ +/* PR rtl-optimization/70574 */ +/* { dg-do run { target lp64 } } */ +/* { dg-require-effective-target avx2 } */ +/* { dg-options "-O -frerun-cse-after-loop -fno-tree-ccp -mcmodel=medium -mavx2" } */ +/* { dg-additional-options "-fPIC" { target fpic } } */ + +#include "avx2-check.h" + +typedef char A __attribute__((vector_size (32))); +typedef short B __attribute__((vector_size (32))); + +int +foo (int x, __int128 y, __int128 z, A w) +{ + y <<= 64; + w *= (A) { 0, -1, z, 0, ~y }; + return w[0] + ((B) { x, 0, y, 0, -1 } | 1)[4]; +} + +static void +avx2_test () +{ + int x = foo (0, 0, 0, (A) {}); + if (x != -1) + __builtin_abort (); +} Jakub