Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Qing Zhao via Gcc-patches Wed, 23 Sep 2020 09:09:09 -0700

> On Sep 23, 2020, at 10:21 AM, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
> 
> Qing Zhao <qing.z...@oracle.com <mailto:qing.z...@oracle.com>> writes:
>>> On Sep 23, 2020, at 9:32 AM, Richard Sandiford <richard.sandif...@arm.com> 
>>> wrote:
>>> 
>>> Qing Zhao <qing.z...@oracle.com> writes:
>>>>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford 
>>>>> <richard.sandif...@arm.com> wrote:
>>>>> 
>>>>> Qing Zhao <qing.z...@oracle.com <mailto:qing.z...@oracle.com>> writes:
>>>>>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford 
>>>>>>> <richard.sandif...@arm.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> The following is what I see in i386.md (I haven't yet looked at how
>>>>>>>>>> UNSPEC_VOLATILE is used in GCC's dataflow analysis):
>>>>>>>>>> 
>>>>>>>>>> ;; UNSPEC_VOLATILE is considered to use and clobber all hard 
>>>>>>>>>> registers and
>>>>>>>>>> ;; all of memory.  This blocks insns from being moved across this 
>>>>>>>>>> point.
>>>>>>>>> 
>>>>>>>>> Heh, it looks like that comment dates back to 1994. :-)
>>>>>>>>> 
>>>>>>>>> The comment is no longer correct though.  I wasn't around at the time,
>>>>>>>>> but I assume the comment was only locally true even then.
>>>>>>>>> 
>>>>>>>>> If what the comment said was true, then something like:
>>>>>>>>> 
>>>>>>>>> (define_insn "cld"
>>>>>>>>> [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
>>>>>>>>> ""
>>>>>>>>> "cld"
>>>>>>>>> [(set_attr "length" "1")
>>>>>>>>> (set_attr "length_immediate" "0")
>>>>>>>>> (set_attr "modrm" "0")])
>>>>>>>>> 
>>>>>>>>> would invalidate the entire register file and so would require all 
>>>>>>>>> values
>>>>>>>>> to be spilt to the stack around the CLD.
>>>>>>>> 
>>>>>>>> Okay, thanks for the info.
>>>>>>>> Then what's the current definition of UNSPEC_VOLATILE?
>>>>>>> 
>>>>>>> I'm not sure it's written down anywhere TBH.  rtl.texi just says:
>>>>>>> 
>>>>>>> @code{unspec_volatile} is used for volatile operations and operations
>>>>>>> that may trap; @code{unspec} is used for other operations.
>>>>>>> 
>>>>>>> which seems like a cyclic definition: volatile expressions are defined
>>>>>>> to be expressions that are volatile.
>>>>>>> 
>>>>>>> But IMO the semantics are that unspec_volatile patterns with a given
>>>>>>> set of inputs and outputs act for dataflow purposes like volatile asms
>>>>>>> with the same inputs and outputs.  The semantics of asm volatile are
>>>>>>> at least slightly more well-defined (if only by example); see 
>>>>>>> extend.texi
>>>>>>> for details.  In particular:
>>>>>>> 
>>>>>>> Note that the compiler can move even @code{volatile asm} instructions 
>>>>>>> relative
>>>>>>> to other code, including across jump instructions. For example, on many 
>>>>>>> targets there is a system register that controls the rounding mode of 
>>>>>>> floating-point operations. Setting it with a @code{volatile asm} 
>>>>>>> statement,
>>>>>>> as in the following PowerPC example, does not work reliably.
>>>>>>> 
>>>>>>> @example
>>>>>>> asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>>>>>>> sum = x + y;
>>>>>>> @end example
>>>>>>> 
>>>>>>> The compiler may move the addition back before the @code{volatile asm}
>>>>>>> statement. To make it work as expected, add an artificial dependency to
>>>>>>> the @code{asm} by referencing a variable in the subsequent code, for
>>>>>>> example:
>>>>>>> 
>>>>>>> @example
>>>>>>> asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>>>>>>> sum = x + y;
>>>>>>> @end example
>>>>>>> 
>>>>>>> which is very similar to the unspec_volatile case we're talking about.
>>>>>>> 
>>>>>>> To take an x86 example:
>>>>>>> 
>>>>>>> void
>>>>>>> f (char *x)
>>>>>>> {
>>>>>>> asm volatile ("");
>>>>>>> x[0] = 0;
>>>>>>> asm volatile ("");
>>>>>>> x[1] = 0;
>>>>>>> asm volatile ("");
>>>>>>> }
>>>>>> 
>>>>>> If we change the above to the following (though the asm syntax might
>>>>>> not be correct):
>>>>>> 
>>>>>> void
>>>>>> f (char *x)
>>>>>> {
>>>>>> asm volatile ("x[0]");
>>>>>> x[0] = 0;
>>>>>> asm volatile ("x[1]");
>>>>>> x[1] = 0;
>>>>>> asm volatile ("");
>>>>>> }
>>>>>> 
>>>>>> Will the moving and merging be blocked?
>>>>> 
>>>>> That would stop assignments moving up, but it wouldn't stop x[0] moving
>>>>> down across the x[1] asm.  Using:
>>>>> 
>>>>> asm volatile ("" ::: "memory");
>>>>> 
>>>>> would prevent moves in both directions, which was what I meant in my
>>>>> later comment about memory clobbers.
>>>>> 
>>>>> In each case, the same would be true for unspec_volatile.
>>>> 
>>>> So, is the following good enough:
>>>> 
>>>> asm volatile (reg1, reg2, … regN, memory)
>>>> mov reg1, 0
>>>> mov reg2, 0
>>>> ...
>>>> mov regN,0
>>>> asm volatile (reg1, reg2,… regN, memory)
>>>> return
>>>> 
>>>> 
>>>> I.e., just add one “asm volatile” insn whose operands include all registers
>>>> and memory, both BEFORE and AFTER the whole zeroing sequence.
>>> 
>>> It isn't clear from your syntax whether the asm volatile arguments
>>> are uses or clobbers.
>> 
>> How can the syntax of asm volatile distinguish “Uses” and “Clobbers”? 
> 
> Well, I wasn't trying to discuss correct syntax, I just wasn't sure what
> you meant.
> 
> As mentioned in the quote below, I was expecting the asm volatile
> before the zeroing to include clobbers generated as discussed in
> the earlier message:
> 
>  rtx asm_op = gen_rtx_ASM_OPERANDS (…);
>  MEM_VOLATILE_P (asm_op) = 1;
> 
>  rtvec v = rtvec_alloc (N + 1);
>  RTVEC_ELT (v, 0) = asm_op;
>  RTVEC_ELT (v, 1) = gen_rtx_CLOBBER (VOIDmode, …);
>  …
>  RTVEC_ELT (v, N) = gen_rtx_CLOBBER (VOIDmode, …);
> 
>  emit_insn (gen_rtx_PARALLEL (VOIDmode, v));
> 
> But doing this after the zeroing would give:
> 
>  …clobber reg1 in an asm…
>  …set reg1 to zero…
>  …clobber reg1 in an asm…
> 
> Dataflow-wise, the second clobber overwrites the effect of the zeroing.
> Since nothing uses reg1 between the zeroing and the clobber, the zeroing
> could be removed as dead.

Okay, I see.
Thanks for the explanation.
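
On my earlier syntax question: as far as I understand GCC's extended asm, the operand lists already separate the two cases. Input operands count as uses, and the last colon-separated list holds the clobbers. A minimal sketch, where the function names zero_two and keep_live are made up for illustration and the empty asm templates emit no instructions:

```c
/* __asm__ is used instead of asm so this also builds with -std=c99.  */

/* Store two zero bytes through X.  The asm has no operands, only a
   "memory" clobber, so the compiler must not move or merge the two
   stores across it.  */
static void
zero_two (char *x)
{
  x[0] = 0;
  __asm__ volatile ("" : /* no outputs */ : /* no inputs */ : "memory");
  x[1] = 0;
}

/* Here V is an input operand, i.e. a use: the asm reads V but does not
   clobber anything.  */
static int
keep_live (int v)
{
  __asm__ volatile ("" : /* no outputs */ : "r" (v));
  return v;
}
```

So a register named in the clobber list is overwritten by the asm, while a register bound to an input operand is read by it.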

> 
>>> The idea was:
>>> 
>>> - There would be an asm volatile before the moves that clobbers (but does
>>> not use) (mem:BLK (scratch)) and the zeroed registers.
>>> 
>>> - EPILOGUE_USES would make the zeroed registers live after the return.
>> 
>> Is EPILOGUE_USES the only way to do this? Would adding another “asm
>> volatile” immediately before the return serve the same purpose?
> 
> Why do you want to use an asm to keep the instructions live though?

I just wanted to avoid changing EPILOGUE_USES and keep the implementation
simpler… :-)
But I might be wrong here.

> 
> As I think I mentioned before (but sorry if I'm misremembering),
> using an asm would be counterproductive on delayed-branch targets.
> The delayed branch scheduler looks backwards for something that could
> fill the delay slot.  If we have an asm after the zeroing instructions
> that uses the zeroed registers, that would prevent any zeroing
> instruction from filling the delay slot.  The delayed branch scheduler
> would therefore try to fill the delay slot with something from before
> the zeroing sequence, which is exactly what we'd like to avoid.
> 
> Also, using an asm after the sequence would allow a machine_reorg
> pass to reuse the zeroed registers for something else between the
> second asm and the return.
> 
> IMO, marking the zeroed registers as being live out of the function
> is the simplest, most direct way of representing the fact that the
> zeroing effect has to survive to the function return.  It's how we
> make sure that the function return value remains live and how we make
> sure that the restored call-preserved registers remain live.

Okay, now I understand.
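
To check my understanding of the dataflow argument, here is a toy backward-liveness model in C. It is only a sketch: the toy_* names and the bit-mask representation are made up for illustration and are not GCC's RTL or DF code. The live_out argument plays the role of the registers that EPILOGUE_USES would mark as live at the return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds; illustrative only, not GCC's RTL codes.  */
enum toy_kind { TOY_SET_ZERO, TOY_CLOBBER, TOY_USE };

struct toy_insn
{
  enum toy_kind kind;
  unsigned reg;			/* Register number, assumed to be 0..31.  */
};

/* Walk a straight-line sequence backwards, tracking live registers in
   a bit mask.  LIVE_OUT is the set of registers live after the last
   instruction.  DEAD[i] is set to true when instruction i is a zeroing
   whose result is never read before being overwritten.  */
static void
toy_mark_dead (const struct toy_insn *insns, size_t n,
	       unsigned live_out, bool *dead)
{
  unsigned live = live_out;
  for (size_t i = n; i-- > 0;)
    {
      unsigned bit = 1u << insns[i].reg;
      dead[i] = false;
      switch (insns[i].kind)
	{
	case TOY_SET_ZERO:
	  /* The zeroing is only needed if the register is live after it.  */
	  dead[i] = (live & bit) == 0;
	  live &= ~bit;
	  break;
	case TOY_CLOBBER:
	  /* A clobber overwrites the register without reading it.  */
	  live &= ~bit;
	  break;
	case TOY_USE:
	  live |= bit;
	  break;
	}
    }
}
```

In this model the sequence "clobber r1; zero r1; clobber r1" marks the zeroing as dead whatever live_out is, because the trailing clobber overwrites it. Dropping the trailing clobber and passing a live_out mask with bit 1 set keeps the zeroing alive.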

Thanks a lot for your patience. 

Qing
> 
> Thanks,
> Richard