Richard Sandiford wrote:
> Georg-Johann Lay <[email protected]> writes:
>> This patch fixes PR51374 by more strictly updating mem_last_set.
>> Sloppy handling of mem_last_set can lead to error in volatile correctness
>> because combine then is allowed to drag one volatile access over an other
>> volatile thing (volatile access, asm volatile, unspec_volatile...).
>>
>> As explained in the source comment, any volatile might change memory, even a
>> volatile read.
>
> This looks too broad to me. Volatile MEMs don't usually alias
> non-volatile ones (so for example, reads from volatile memory shouldn't
> affect reads from non-volatile memory).
The patch tried to fix the issue. If there is less invasive solutions, that's
always preferred, of course.
> What do you think about instead changing:
>
> /* If there are any volatile insns between INSN and I3, reject, because
> they might affect machine state. */
>
> for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
> if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN
> (p)))
> return 0;
>
> to:
>
> /* If INSN contains volatile references (specifically volatile MEMs),
> we cannot combine across any other volatile references.
> Even if INSN doesn't contain volatile references, any intervening
> volatile insn might affect machine state. */
>
> pred = volatile_refs_p (insn) ? volatile_refs_p : volatile_insn_p;
> for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
> if (NONDEBUG_INSN_P (p) && p != succ && p != succ2 && pred (PATTERN (p)))
> return 0;
This would still allow a volatile_refs_p to cross a volatile_insn_p?
And some lines above there is this:
/* If INSN contains anything volatile, or is an `asm' (whether volatile
or not), reject, unless nothing volatile comes between it and I3 */
if (GET_CODE (src) == ASM_OPERANDS || volatile_refs_p (src))
{
/* Make sure neither succ nor succ2 contains a volatile reference. */
if (succ2 != 0 && volatile_refs_p (PATTERN (succ2)))
return 0;
if (succ != 0 && volatile_refs_p (PATTERN (succ)))
return 0;
/* We'll check insns between INSN and I3 below. */
}
Would our change (whatever it will look like) make that code superfluous?
> As you say in the PR notes, we've already handled volatile_insn_p (insn)
> by this stage, so the new loop is handling:
>
> volatile_refs_p (insn) && !volatile_insn_p (insn)
There is this part of an if-clause in can_combine_p:
/* Make sure that the value that is to be substituted for the register
does not use any registers whose values alter in between. However,
If the insns are adjacent, a use can't cross a set even though we
think it might (this can happen for a sequence of insns each setting
the same destination; last_set of that register might point to
a NOTE). If INSN has a REG_EQUIV note, the register is always
equivalent to the memory so the substitution is valid even if there
are intervening stores. Also, don't move a volatile asm or
UNSPEC_VOLATILE across any other insns. */
|| (! all_adjacent
&& (((!MEM_P (src)
|| ! find_reg_note (insn, REG_EQUIV, src))
&& use_crosses_set_p (src, DF_INSN_LUID (insn)))
|| (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src))
|| GET_CODE (src) == UNSPEC_VOLATILE))
so it's not a verbatim volatile_insn_p. Maybe it's for historical reason or for
some subtle differences not explained; only someone familiar with combine will
know... perhaps performance issue.
>> Moreover, writes of the shape (set (zero_extract (mem ... update
>> mem_last_set.
>
> Good catch. I think we should use note_stores for added safety though,
> rather than call zero_extract out as a special case. (I realise the
> function already handles subregs, but still.)
record_dead_and_set_regs_1 is called through note_stores so that (set
(zero_extract (mem)) is handled already and thus no change to
record_dead_and_set_regs_1 is needed, right?
Attached you find a new, tentative patch. It resolves the issue in my small
test program.
However, I think someone with more insight into combine should take over the
patch.
Johann
PR rtl-optimization/51374
* combine.c (can_combine_p): Don't allow volatile_refs_p insns
to cross other volatile_refs_p insns.
Index: combine.c
===================================================================
--- combine.c (revision 183695)
+++ combine.c (working copy)
@@ -1947,12 +1947,21 @@ can_combine_p (rtx insn, rtx i3, rtx pre
&& REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER)
return 0;
- /* If there are any volatile insns between INSN and I3, reject, because
- they might affect machine state. */
+ /* If INSN contains volatile references (specifically volatile MEMs),
+ we cannot combine across any other volatile references.
+ Even if INSN doesn't contain volatile references, any intervening
+ volatile insn might affect machine state. */
+ {
+ int (*predicat) (const_rtx);
- for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
- if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p)))
- return 0;
+ predicat = volatile_refs_p (PATTERN (insn))
+ ? volatile_refs_p
+ : volatile_insn_p;
+
+ for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
+ if (INSN_P (p) && p != succ && p != succ2 && predicat (PATTERN (p)))
+ return 0;
+ }
/* If INSN contains an autoincrement or autodecrement, make sure that
register is not used between there and I3, and not already used in