On 07/30/16 13:39, Segher Boessenkool wrote:
> On Sat, Jul 30, 2016 at 08:17:59AM +0000, Bernd Edlinger wrote:
>> Ok, I the incorrect REG_POINTER is done here:
>>
>> cse_main -> reg_scan -> reg_scan_mark_refs -> set_reg_attrs_from_value
>>
>> and here I see a bug, because if POINTERS_EXTEND_UNSIGNED
>> can_be_reg_pointer is only set to false if x is SIGN_EXTEND but not
>> if x is a SUBREG as in this case.
>>
>> So I think that should be fixed this way:
>>
>> Index: emit-rtl.c
>> ===================================================================
>> --- emit-rtl.c (revision 238891)
>> +++ emit-rtl.c (working copy)
>> @@ -1155,7 +1155,7 @@ set_reg_attrs_from_value (rtx reg, rtx x)
>> || (GET_CODE (x) == SUBREG && subreg_lowpart_p (x)))
>> {
>> #if defined(POINTERS_EXTEND_UNSIGNED)
>> - if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED)
>> + if (((GET_CODE (x) != ZERO_EXTEND && POINTERS_EXTEND_UNSIGNED)
>> || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED))
>> && !targetm.have_ptr_extend ())
>> can_be_reg_pointer = false;
>>
>>
>> What do you think does this look like the right fix?
>
> There also is (from rtl.texi), for paradoxical subregs:
>
> """
> @item @code{subreg} of @code{reg}s
> The upper bits are defined when @code{SUBREG_PROMOTED_VAR_P} is true.
> @code{SUBREG_PROMOTED_UNSIGNED_P} describes what the upper bits hold.
> Such subregs usually represent local variables, register variables
> and parameter pseudo variables that have been promoted to a wider mode.
> """
>
> so you might want to test for these as well.
>
like this?
Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 238891)
+++ emit-rtl.c (working copy)
@@ -1156,7 +1156,11 @@
{
#if defined(POINTERS_EXTEND_UNSIGNED)
if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED)
- || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED))
+ || (GET_CODE (x) == ZERO_EXTEND && ! POINTERS_EXTEND_UNSIGNED)
+ || (paradoxical_subreg_p (x)
+ && ! (SUBREG_PROMOTED_VAR_P (x)
+ && SUBREG_CHECK_PROMOTED_SIGN (x,
+ POINTERS_EXTEND_UNSIGNED))))
&& !targetm.have_ptr_extend ())
can_be_reg_pointer = false;
#endif
In the test case of pr71779 the subreg is no promoted var, so this
has no influence at this time. Also I have not POINTERS_EXTEND_SIGNED
target, but for the symmetry it ought to check explicitly for
ZERO_EXTEND as well, and allow the pointer value to pass thru a
TRUNCATE.
>> With this patch the code the reg/f:DI 481 does no longer appear,
>> and also the invalid combine does no longer happen.
>
> Good :-)
>
>> However the test case from pr70903 does not get fixed by this.
>>
>> But when I look at the dumps, I see again the first incorrect
>> transformation in cse2 (again cse!):
>>
>> (insn 20 19 21 2 (set (reg:DI 94)
>> (subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64}
>> (expr_list:REG_EQUAL (const_int 255 [0xff])
>> (expr_list:REG_DEAD (reg:QI 93)
>> (nil))))
>>
>> but that is simply wrong, because later optimization passes
>> expect reg 94 to be 0x000000ff but the upper bits are unspecified,
>> so that REG_EQUAL should better not exist.
>
> Agreed. So where does that come from?
>
>> When I looked at cse.c I saw a comment in #if 0, which exactly
>> describes the problem that we have with paradoxical subreg here:
>
> <snip>
>
>> I am pretty sure, that this has to be reverted, and that it can
>> generate less efficient code.
>>
>> What do you think?
>
> I think this pessimises the generated code too much; there must be a
> better solution.
>
I debugged the cse again, to see how it works and why it mis-compiles
this example.
I found out that the trouble starts one instruction earlier:
(insn 19 40 20 2 (set (subreg:DI (reg:QI 93) 0)
) pr.c:8 50 {*movdi_aarch64}
(expr_list:REG_DEAD (reg:DI 110 [ D.3037 ])
(nil)))
cse_main sees the constant value and maps:
(reg:QI 93) => (const_int 255 [0xff])
plus (I mean that is wrong):
(subreg:DI (reg:QI 93) 0) => (const_int 255 [0xff])
When the next insn is scanned
(insn 20 19 21 2 (set (reg:DI 94)
(subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64}
(expr_list:REG_DEAD (reg:QI 93)
(nil)))
I see fold_rtx (subreg:DI (reg:QI 93) 0))
return (const_int 255 [0xff]) which is wrong.
now cse maps:
(reg:DI 94) => (const_int 255 [0xff])
which is also not guaranteed to be correct, but the REG_EQUAL at
insn 20 is probably only a symptom, suppressing only that does not
fix the test case, because later we have:
(insn 25 24 26 2 (set (reg:DI 97)
(const_int 255 [0xff])) pr.c:9 50 {*movdi_aarch64}
(nil))
(insn 26 25 48 2 (set (zero_extract:DI (reg:DI 102 [ D.3038 ])
(const_int 32 [0x20])
(const_int 32 [0x20]))
(reg:DI 97)) pr.c:9 691 {*insv_regdi}
(expr_list:REG_DEAD (reg:DI 97)
(nil)))
get replaced to
(insn 26 24 48 2 (set (zero_extract:DI (reg:DI 102 [ D.3038 ])
(const_int 32 [0x20])
(const_int 32 [0x20]))
(reg:DI 94)) pr.c:9 691 {*insv_regdi}
(expr_list:REG_DEAD (reg:DI 97)
(nil)))
because when insn 25 is scanned it the constant 255 is already
mapped to reg 94 at insn 20.
now cse maps:
(reg:DI 97) => (reg:DI 94)
therefore canon_reg (reg:DI 97) returns (reg:DI 94)
which makes canonicalize_insn do the wrong transformation at insn 26.
Now I think I found a better place for a patch, where the first bogus
mapping is recorded:
Index: cse.c
===================================================================
--- cse.c (revision 238891)
+++ cse.c (working copy)
@@ -5898,15 +5898,7 @@ cse_insn (rtx_insn *insn)
|| GET_MODE (dest) == BLKmode
/* If we didn't put a REG_EQUAL value or a source into the hash
table, there is no point is recording DEST. */
- || sets[i].src_elt == 0
- /* If DEST is a paradoxical SUBREG and SRC is a ZERO_EXTEND
- or SIGN_EXTEND, don't record DEST since it can cause
- some tracking to be wrong.
-
- ??? Think about this more later. */
- || (paradoxical_subreg_p (dest)
- && (GET_CODE (sets[i].src) == SIGN_EXTEND
- || GET_CODE (sets[i].src) == ZERO_EXTEND)))
+ || sets[i].src_elt == 0)
continue;
/* STRICT_LOW_PART isn't part of the value BEING set,
@@ -5925,6 +5917,11 @@ cse_insn (rtx_insn *insn)
sets[i].dest_hash = HASH (dest, GET_MODE (dest));
}
+ /* If DEST is a paradoxical SUBREG, don't record DEST since it can
+ cause some tracking to be wrong. */
+ if (paradoxical_subreg_p (dest))
+ continue;
+
elt = insert (dest, sets[i].src_elt,
sets[i].dest_hash, GET_MODE (dest));
So apparently there was already an attempt of a fix for a similar bug,
and svn blame points to:
svn log -v -r8354
------------------------------------------------------------------------
r8354 | kenner | 1994-10-28 23:55:05 +0100 (Fri, 28 Oct 1994) | 3 lines
Changed paths:
M /trunk/gcc/cse.c
(cse_insn): Don't record a DEST a paradoxical SUBREG and SRC is a
SIGN_EXTEND or ZERO_EXTEND.
------------------------------------------------------------------------
This way we can still map the underlying QI register to 255 but
not the SUBREG if it is a paradoxical subreg.
In the test case this patch still works (output code does not change).
What do you think?
Bernd.