Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Qing Zhao via Gcc-patches Wed, 23 Sep 2020 07:49:52 -0700

> On Sep 23, 2020, at 9:32 AM, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
> 
> Qing Zhao <qing.z...@oracle.com> writes:
>>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford <richard.sandif...@arm.com> 
>>> wrote:
>>> 
>>> Qing Zhao <qing.z...@oracle.com> writes:
>>>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford 
>>>>> <richard.sandif...@arm.com> wrote:
>>>>>>>> 
>>>>>>>> The following is what I see from i386.md: (I didn’t look at how 
>>>>>>>> “UNSPEC_volatile” is used in data flow analysis in GCC yet)
>>>>>>>> 
>>>>>>>> ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
>>>>>>>> ;; all of memory.  This blocks insns from being moved across this point.
>>>>>>> 
>>>>>>> Heh, it looks like that comment dates back to 1994. :-)
>>>>>>> 
>>>>>>> The comment is no longer correct though.  I wasn't around at the time,
>>>>>>> but I assume the comment was only locally true even then.
>>>>>>> 
>>>>>>> If what the comment said was true, then something like:
>>>>>>> 
>>>>>>> (define_insn "cld"
>>>>>>> [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
>>>>>>> ""
>>>>>>> "cld"
>>>>>>> [(set_attr "length" "1")
>>>>>>> (set_attr "length_immediate" "0")
>>>>>>> (set_attr "modrm" "0")])
>>>>>>> 
>>>>>>> would invalidate the entire register file and so would require all values
>>>>>>> to be spilt to the stack around the CLD.
>>>>>> 
>>>>>> Okay, thanks for the info.
>>>>>> Then, what’s the current definition of UNSPEC_VOLATILE?
>>>>> 
>>>>> I'm not sure it's written down anywhere TBH.  rtl.texi just says:
>>>>> 
>>>>> @code{unspec_volatile} is used for volatile operations and operations
>>>>> that may trap; @code{unspec} is used for other operations.
>>>>> 
>>>>> which seems like a cyclic definition: volatile expressions are defined
>>>>> to be expressions that are volatile.
>>>>> 
>>>>> But IMO the semantics are that unspec_volatile patterns with a given
>>>>> set of inputs and outputs act for dataflow purposes like volatile asms
>>>>> with the same inputs and outputs.  The semantics of asm volatile are
>>>>> at least slightly more well-defined (if only by example); see extend.texi
>>>>> for details.  In particular:
>>>>> 
>>>>> Note that the compiler can move even @code{volatile asm} instructions
>>>>> relative to other code, including across jump instructions. For example,
>>>>> on many targets there is a system register that controls the rounding
>>>>> mode of floating-point operations. Setting it with a @code{volatile asm}
>>>>> statement, as in the following PowerPC example, does not work reliably.
>>>>> 
>>>>> @example
>>>>> asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>>>>> sum = x + y;
>>>>> @end example
>>>>> 
>>>>> The compiler may move the addition back before the @code{volatile asm}
>>>>> statement. To make it work as expected, add an artificial dependency to
>>>>> the @code{asm} by referencing a variable in the subsequent code, for
>>>>> example:
>>>>> 
>>>>> @example
>>>>> asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>>>>> sum = x + y;
>>>>> @end example
>>>>> 
>>>>> which is very similar to the unspec_volatile case we're talking about.
>>>>> 
>>>>> To take an x86 example:
>>>>> 
>>>>> void
>>>>> f (char *x)
>>>>> {
>>>>>  asm volatile ("");
>>>>>  x[0] = 0;
>>>>>  asm volatile ("");
>>>>>  x[1] = 0;
>>>>>  asm volatile ("");
>>>>> }
>>>> 
>>>> If we change the above to the following (the asm format might not be
>>>> correct):
>>>> 
>>>> void
>>>> f (char *x)
>>>> {
>>>> asm volatile ("x[0]");
>>>> x[0] = 0;
>>>> asm volatile ("x[1]");
>>>> x[1] = 0;
>>>> asm volatile ("");
>>>> }
>>>> 
>>>> Will the moving and merging be blocked?
>>> 
>>> That would stop assignments moving up, but it wouldn't stop x[0] moving
>>> down across the x[1] asm.  Using:
>>> 
>>> asm volatile ("" ::: "memory");
>>> 
>>> would prevent moves in both directions, which was what I meant in my
>>> later comment about memory clobbers.
>>> 
>>> In each case, the same would be true for unspec_volatile.
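
As a minimal sketch of the memory-clobber form mentioned above: the empty asm
below uses nothing but clobbers memory, so the compiler has to assume it may
read or write any memory, which should keep the two stores on their own sides
of it. The function name store_pair is hypothetical and not from the thread,
and __asm__ is just the spelling of asm that is also accepted in strict ISO
modes.

```c
/* Hypothetical example, not part of the patch under discussion.  */
void
store_pair (char *x)
{
  x[0] = 0;
  /* Empty template, no outputs, no inputs, "memory" clobber only.
     The compiler treats this as a point where any memory may be
     accessed, so the store above should not sink below it and the
     store below should not be hoisted above it.  */
  __asm__ volatile ("" : : : "memory");
  x[1] = 0;
}
```

Without the clobber, i.e. with a plain asm volatile (""), nothing ties the
asm to the stores, which is the situation the x86 example earlier in the
thread describes.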
>> 
>> So, is the following good enough:
>> 
>> asm volatile (reg1, reg2, … regN, memory)
>> mov reg1, 0
>> mov reg2, 0
>> ...
>> mov regN,0
>> asm volatile (reg1, reg2,… regN, memory)
>> return
>> 
>> 
>> I.e., just add one “asm volatile” insn, whose operands include all registers
>> and memory, both BEFORE and AFTER the whole zeroing sequence.
> 
> It isn't clear from your syntax whether the asm volatile arguments
> are uses or clobbers.

How can the syntax of asm volatile distinguish “Uses” and “Clobbers”? 
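
For reference, a small sketch of how the source-level extended asm syntax
separates the two: operands are grouped by position, with outputs after the
first colon, inputs (uses) after the second, and clobbers after the third.
The function name below is made up for illustration, and this only shows the
C-level syntax, not the RTL that the patch would generate.

```c
/* Hypothetical helper, only meant to show the three operand sections.  */
int
asm_operand_sections (int v)
{
  int out;

  /* First section: outputs.  "=r" says the asm writes OUT.
     Second section: inputs, i.e. uses.  The "0" constraint ties the
     input V to operand 0, so with an empty template OUT ends up
     holding the value of V.  */
  __asm__ volatile ("" : "=r" (out) : "0" (v));

  /* Third section: clobbers.  Nothing is used or set here; the asm
     only declares that memory may be modified.  */
  __asm__ volatile ("" : : : "memory");

  return out;
}
```

A register or "memory" named in the third section is therefore a clobber,
while anything listed in the second section is a use.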

>  The idea was:
> 
> - There would be an asm volatile before the moves that clobbers (but does
>  not use) (mem:BLK (scratch)) and the zeroed registers.
> 
> - EPILOGUE_USES would make the zeroed registers live after the return.

Is EPILOGUE_USES the only way to do this? Would adding another “asm volatile”
immediately before the return serve the same purpose?


> 
>> Or, we have to add one “asm volatile” insn before and after each “mov” insn? 
> 
> No, the idea with the multiple clobber thing was to have a single asm.
Okay.
> 
>>>> If we use “ASM_OPERANDS” instead of “UNSPEC_VOLATILE” as you suggested, 
>>>> the data flow analysis should automatically pick up the operands of 
>>>> “ASM_OPERANDS”, and fix the data flow, right?
>>> 
>>> Using a volatile asm or an unspec_volatile would be equally correct.
>>> The reason for preferring a volatile asm is that it doesn't require
>>> target-specific .md patterns.
>> Okay.
>> 
>> Then is there any benefit to using “UNSPEC_volatile” over “volatile asm”?
> 
> In general, yes: you can use the full .md functionality with
> unspec_volatiles, such as splitting insns, adding match_scratches
> with different clobber requirements, writing custom output code,
> setting attributes, etc.
> 
> But there isn't an advantage to using unspec_volatile in this case,
> where the instruction doesn't actually do anything.

Okay, I see. 

> 
>>> Of course, as mentioned before, “correct” in this case is: make a good
>>> but not foolproof attempt at trying to prevent later passes from moving
>>> the zeroing instructions further away from the return instruction
>>> (or, equivalently, moving other instructions closer to the return
>>> instruction).  Remember that we arrived here from a discussion about
>>> whether the volatile insns would be enough to prevent machine_reorg and
>>> other passes from moving instructions around (modulo bugs in those passes).
>>> My position was that the volatile insns would help, but that we might
>>> still find cases where a machine_reorg makes a behaviourally-correct
>>> transformation that we don't want.
>> So, you mean that after adding “volatile asm” or “UNSPEC_volatile”, although
>> most of the insn movement can be prevented, there is still a small
>> possibility that some unwanted transformation might happen?
> 
> I wouldn't want to quantify the possibility.  The point is just that the
> possibility exists.  The unspec_volatile does not prevent movement of
> unrelated non-volatile operations.

Okay. 

Thanks.

Qing
> 
> Thanks,
> Richard