On 01/13/15 09:18, Jakub Jelinek wrote:
Hi!
My PR60663 fix unfortunately stopped CSE of all inline-asms, even when
they e.g. only have the clobbers added by default.
This patch attempts to restore the old behavior, with the exceptions:
1) as always, asm volatile is not CSEd
2) inline-asm with multiple outputs are not CSEd
3) on request from Richard (which Segher on IRC argues against), "memory"
clobber also prevents CSE; this can be removed by removing the
int j, lim = XVECLEN (x, 0); and loop below it
4) inline-asm with clobbers is never copied into an insn that wasn't
inline-asm before, so if there are clobbers, we allow CSEing of
e.g. two same inline-asms, but only by reusing results of one
of those
Bootstrapped/regtested on x86_64-linux and i686-linux, tested also
with arm cross after reverting the PR60663 arm cost fix.
Ok for trunk this way, or with 3) removed?
2015-01-13 Jakub Jelinek <ja...@redhat.com>
PR rtl-optimization/63637
PR rtl-optimization/60663
* cse.c (merge_equiv_classes): Set new_elt->cost to MAX_COST
if elt->cost is MAX_COST for ASM_OPERANDS.
(find_sets_in_insn): Fix up comment typo.
(cse_insn): Don't set src_volatile for all non-volatile
ASM_OPERANDS in PARALLELs, but just those with multiple outputs
or with "memory" clobber. Set elt->cost to MAX_COST
for ASM_OPERANDS in PARALLEL. Set src_elt->cost to MAX_COST
if new_src is ASM_OPERANDS and elt->cost is MAX_COST.
* gcc.dg/pr63637-1.c: New test.
* gcc.dg/pr63637-2.c: New test.
* gcc.dg/pr63637-3.c: New test.
* gcc.dg/pr63637-4.c: New test.
* gcc.dg/pr63637-5.c: New test.
* gcc.dg/pr63637-6.c: New test.
* gcc.target/i386/pr63637-1.c: New test.
* gcc.target/i386/pr63637-2.c: New test.
* gcc.target/i386/pr63637-3.c: New test.
* gcc.target/i386/pr63637-4.c: New test.
* gcc.target/i386/pr63637-5.c: New test.
* gcc.target/i386/pr63637-6.c: New test.
--- gcc/cse.c.jj 2015-01-09 21:59:44.000000000 +0100
+++ gcc/cse.c 2015-01-13 13:26:23.391216064 +0100
@@ -1792,6 +1792,8 @@ merge_equiv_classes (struct table_elt *c
}
new_elt = insert (exp, class1, hash, mode);
new_elt->in_memory = hash_arg_in_memory;
+ if (GET_CODE (exp) == ASM_OPERANDS && elt->cost == MAX_COST)
+ new_elt->cost = MAX_COST;
}
}
}
@@ -4258,7 +4260,7 @@ find_sets_in_insn (rtx_insn *insn, struc
{
int i, lim = XVECLEN (x, 0);
- /* Go over the epressions of the PARALLEL in forward order, to
+ /* Go over the expressions of the PARALLEL in forward order, to
put them in the same order in the SETS array. */
for (i = 0; i < lim; i++)
{
@@ -4634,12 +4636,27 @@ cse_insn (rtx_insn *insn)
&& REGNO (dest) >= FIRST_PSEUDO_REGISTER)
sets[i].src_volatile = 1;
- /* Also do not record result of a non-volatile inline asm with
- more than one result or with clobbers, we do not want CSE to
- break the inline asm apart. */
else if (GET_CODE (src) == ASM_OPERANDS
&& GET_CODE (x) == PARALLEL)
- sets[i].src_volatile = 1;
+ {
+ /* Do not record result of a non-volatile inline asm with
+ more than one result. */
+ if (n_sets > 1)
+ sets[i].src_volatile = 1;
+
+ int j, lim = XVECLEN (x, 0);
+ for (j = 0; j < lim; j++)
+ {
+ rtx y = XVECEXP (x, 0, j);
+ /* And do not record result of a non-volatile inline asm
+ with "memory" clobber. */
+ if (GET_CODE (y) == CLOBBER && MEM_P (XEXP (y, 0)))
Can you please add a comment here which references the full form of the
"memory" tag. (clobber (mem:BLK (scratch))).
If we ever have to look at this again (say perhaps to break out the read
anything vs write anything into separate tags :-) it'll save
considerable time and angst trying to track all this stuff down.
The tests you've got are a step forward, but there's obviously a lot
more we could do. For example testing DSE around ASMs without and
without a memory "clobber", testing CSE of unrelated memory references
around an ASM without and without a memory clobber come to mind. You
don't have to add them to get approval, but if you were to take the time
to cobble them together it'd be hugely appreciated.
Given the discussion with Segher, let's give him a chance to chime in on
tonight's messages before we make a final decision.
jeff