Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Qing Zhao via Gcc-patches Wed, 23 Sep 2020 07:15:42 -0700

> On Sep 23, 2020, at 6:05 AM, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
> 
> Qing Zhao <qing.z...@oracle.com <mailto:qing.z...@oracle.com>> writes:
>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford <richard.sandif...@arm.com> 
>>> wrote:
>>>>>> 
>>>>>> The following is what I see from i386.md: (I didn’t look at how 
>>>>>> “UNSPEC_volatile” is used in data flow analysis in GCC yet)
>>>>>> 
>>>>>> ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers 
>>>>>> and
>>>>>> ;; all of memory.  This blocks insns from being moved across this point.
>>>>> 
>>>>> Heh, it looks like that comment dates back to 1994. :-)
>>>>> 
>>>>> The comment is no longer correct though.  I wasn't around at the time,
>>>>> but I assume the comment was only locally true even then.
>>>>> 
>>>>> If what the comment said was true, then something like:
>>>>> 
>>>>> (define_insn "cld"
>>>>> [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
>>>>> ""
>>>>> "cld"
>>>>> [(set_attr "length" "1")
>>>>> (set_attr "length_immediate" "0")
>>>>> (set_attr "modrm" "0")])
>>>>> 
>>>>> would invalidate the entire register file and so would require all values
>>>>> to be spilt to the stack around the CLD.
>>>> 
>>>> Okay, thanks for the info. 
>>>> then, what’s the current definition of UNSPEC_VOLATILE? 
>>> 
>>> I'm not sure it's written down anywhere TBH.  rtl.texi just says:
>>> 
>>> @code{unspec_volatile} is used for volatile operations and operations
>>> that may trap; @code{unspec} is used for other operations.
>>> 
>>> which seems like a cyclic definition: volatile expressions are defined
>>> to be expressions that are volatile.
>>> 
>>> But IMO the semantics are that unspec_volatile patterns with a given
>>> set of inputs and outputs act for dataflow purposes like volatile asms
>>> with the same inputs and outputs.  The semantics of asm volatile are
>>> at least slightly more well-defined (if only by example); see extend.texi
>>> for details.  In particular:
>>> 
>>> Note that the compiler can move even @code{volatile asm} instructions 
>>> relative
>>> to other code, including across jump instructions. For example, on many 
>>> targets there is a system register that controls the rounding mode of 
>>> floating-point operations. Setting it with a @code{volatile asm} statement,
>>> as in the following PowerPC example, does not work reliably.
>>> 
>>> @example
>>> asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>>> sum = x + y;
>>> @end example
>>> 
>>> The compiler may move the addition back before the @code{volatile asm}
>>> statement. To make it work as expected, add an artificial dependency to
>>> the @code{asm} by referencing a variable in the subsequent code, for
>>> example:
>>> 
>>> @example
>>> asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>>> sum = x + y;
>>> @end example
>>> 
>>> which is very similar to the unspec_volatile case we're talking about.
>>> 
>>> To take an x86 example:
>>> 
>>> void
>>> f (char *x)
>>> {
>>>   asm volatile ("");
>>>   x[0] = 0;
>>>   asm volatile ("");
>>>   x[1] = 0;
>>>   asm volatile ("");
>>> }
>> 
>> If we change the above as the following: (but it might not correct on the 
>> asm format):
>> 
>> Void
>> F (char *x)
>> {
>> asm volatile (“x[0]”);
>> x[0] = 0;
>> asm volatile (“x[1]"); 
>> x[1] = 0;
>> asm volatile ("”);
>> }
>> 
>> Will the moving and merging be blocked?
> 
> That would stop assignments moving up, but it wouldn't stop x[0] moving
> down across the x[1] asm.  Using:
> 
>  asm volatile ("" ::: "memory");
> 
> would prevent moves in both directions, which was what I meant in my
> later comment about memory clobbers.
> 
> In each case, the same would be true for unspec_volatile.

So, is the following good enough:

asm volatile (reg1, reg2, … regN, memory)
mov reg1, 0
mov reg2, 0
...
mov regN,0
asm volatile (reg1, reg2,… regN, memory)
return


I.e, just add one “asm volatile” insn whose operands include all registers and 
memory BEFORE and AFTER the whole zeroing sequence.

Or, we have to add one “asm volatile” insn before and after each “mov” insn? 


> 
>> I found the following code in df-scan.c:
>> 
>> static void
>> df_uses_record (class df_collection_rec *collection_rec,
>>                rtx *loc, enum df_ref_type ref_type,
>>                basic_block bb, struct df_insn_info *insn_info,
>>                int flags)
>> {
>> …
>> 
>>    case ASM_OPERANDS:
>>    case UNSPEC_VOLATILE:
>>    case TRAP_IF:
>>    case ASM_INPUT:
>> …
>>        if (code == ASM_OPERANDS)
>>          {
>>            int j;
>> 
>>            for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
>>              df_uses_record (collection_rec, &ASM_OPERANDS_INPUT (x, j),
>>                              DF_REF_REG_USE, bb, insn_info, flags);
>>            return;
>>          }
>>        break;
>> …
>> }
>> 
>> 
>> Looks like ONLY the operands of  “ASM_OPERANDS” are recorded as USES in df 
>> analysis,  the operands of “UNSPEC_VOLATILE” are NOT. 
> 
> The recursion code after the switch statement handles the operands of
> unspec_volatile.

Okay, I see. 
So, these two are actually equal to each other. 

> 
>> If we use “ASM_OPERANDS” instead of “UNSPEXC_VOLATILE” as you suggested, the 
>> data flow analysis should automatically pick up the operands of 
>> “ASM_OPERANDS”, and fix the data flow, right?
> 
> Using a volatile asm or an unspec_volatile would be equally correct.
> The reason for preferring a volatile asm is that it doesn't require
> target-specific .md patterns.
Okay.

Then is there any benefit to use “UNSPEC_volatile” over “volatile asm”?
> 
> Of course, as mentioned before, “correct” in this case is: make a good
> but not foolproof attempt at trying to prevent later passes from moving
> the zeroing instructions further away from the return instruction
> (or, equivalently, moving other instructions closer to the return
> instruction).  Remember that we arrived here from a discussion about
> whether the volatile insns would be enough to prevent machine_reorg and
> other passes from moving instructions around (modulo bugs in those passes).
> My position was that the volatile insns would help, but that we might
> still find cases where a machine_reorg makes a behaviourally-correct
> transformation that we don't want.
So, you mean after adding “volatile asm” or “UNSPEC_volatile”,  although 
most of the insn movement can be prevented, there might still be small 
possibitly 
Some unwanted transformation might happen?

> 
>>> Passes like fwprop.c, postreload-gcse.c and ree.c do not (AFAIK) worry
>>> about volatile asms or unspec_volatiles, and can move code across them.
>>> And that's kind-of inevitable.  Having an “everything barrier” makes
>>> life very hard for global optimisation.
>> 
>> Okay, so, it’s intentionally not making UNSPEC_VOLATILE as an “everything 
>> barrier”? 
> 
> Yeah.
> 
>> (But I do feel that the design for UNSPEC_volatile is not clean)
> 
> Agreed.  But I think that's partly because what it's trying to achieve
> isn't clean either.  It's a catch-all for “something is happening,
> but we're not saying what”.  And not saying what is itself unclean. ;-)

thanks.

Qing
> 
> Thanks,
> Richard
Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Reply via email to