Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)

Segher Boessenkool Wed, 14 Jan 2015 07:20:12 -0800

On Tue, Jan 13, 2015 at 11:16:09PM -0700, Jeff Law wrote:
> On 01/13/15 17:03, Segher Boessenkool wrote:
> >On Tue, Jan 13, 2015 at 03:17:08PM -0700, Jeff Law wrote:
> >>>And finally there is the case of non-volatile asm with "memory" clobber
> >>>with
> >>>no memory stores in between the two - the posted (safer) patch will not
> >>>allow to CSE the two, while in theory we could CSE them into just one 
> >>>asm.
> >>I think we have to assume that CSEing them is wrong.  The first may set
> >>something in memory that is read by the second.
> >>
> >>Thoughts?
> >
> >I agree with pretty much everything you say in the thread, except for this
> >idea that a memory clobber reads memory.  No clobber reads anything.
> >
> >The commit that introduced the memory clobber concept, 426b38c9 (svn 1207),
> >by rms, has as only comment
> >
> >     /* `memory', don't cache memory across asm */
> RMS botched this and you can see it in that the scheduler was not 
> updated at the same time.  The scheduler absolutely must track if an ASM 
> does a memory read of an arbitrary location.  I'd have to dig deeper to 
> see when this got fixed, but it was clearly botched.
> 
> 
> Many years later another pass which needs to precisely track such things 
> came along, namely DSE.
> 
> The code in DSE is actually easier to grok.
> 
> 
> First, if you look at the ASM handling in cfgexpand.c you'll find:
> 
>             if (j == -4)      /* `memory', don't cache memory across asm */
>                 {
>                   XVECEXP (body, 0, i++)
>                     = gen_rtx_CLOBBER (VOIDmode,
>                                        gen_rtx_MEM
>                                        (BLKmode,
>                                         gen_rtx_SCRATCH (VOIDmode)));
>                   continue;
>                 }


That is in fact the code I mentioned, the first code that was added :-)
It was moved from stmt.c to cfgexpand.c in 2013.

> So we generate (CLOBBER (MEM:BLK (SCRATCH))) when we see "memory" in the 
> "clobber" list of an ASM.

Right.  And we actually have documentation for that construct (woohoo!),
see rtl.texi "Side Effect Expressions":

"
@findex clobber
@item (clobber @var{x})
Represents the storing or possible storing of an unpredictable,
undescribed value into @var{x}, which must be a @code{reg},
@code{scratch}, @code{parallel} or @code{mem} expression.

[...]

If @var{x} is @code{(mem:BLK (const_int 0))} or
@code{(mem:BLK (scratch))}, it means that all memory
locations must be presumed clobbered.
"

Note it doesn't mention reading memory.


> If you then look at dse.c we have this in record_store:
> 
>  /* At this point we know mem is a mem. */
>   if (GET_MODE (mem) == BLKmode)
>     {
>       if (GET_CODE (XEXP (mem, 0)) == SCRATCH)
>         {
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, " adding wild read for (clobber 
> (mem:BLK (scratch))\n");
>           add_wild_read (bb_info);
>           insn_info->cannot_delete = true;
>           return 0;
>         }
> 
> 
> Which says very precisely that we treat (CLOBBER (MEM:BLK (SCRATCH))) as 
> potentially *reading* any location.
> 
> If you trace through how the scheduler builds dependencies, paying 
> particular attention to alias.c you'll see that (CLOBBER (MEM:BLK 
> (SCRATCH))) is treated as both a read and a write of an arbitrary location.

Many transforms need to treat it like that, sure, because the clobber
only writes some of the memory.  Since you do not know what memory
is and isn't written, all stores before the clobber have to stay before
the clobber, and that is most easily represented as the insn with the
clobber having a read dependency on all memory.


Now if we go back to my earlier quote:

"
If your assembler instructions access memory in an unpredictable
fashion, add @samp{memory} to the list of clobbered registers.  This
causes GCC to not keep memory values cached in registers across the
assembler instruction and not optimize stores or loads to that memory.
You also should add the @code{volatile} keyword if the memory
affected is not listed in the inputs or outputs of the @code{asm}, as
the @samp{memory} clobber does not count as a side-effect of the
@code{asm}.
"

That last line means the compiler is free to delete a non-volatile
asm with a memory clobber if that asm is not needed for dataflow.  Or
that is how I read it; it is trying to indicate that if you want to
prevent the memory clobber from being deleted (together with the rest
of the asm), you need to make the asm volatile.

So as far as I can see the compiler can CSE two identical non-volatile
asms with memory clobber just fine.  Older GCC (I tried 4.7.2) does do
this; current mainline doesn't.  I think it should.


Segher

Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)

Reply via email to