> On Sep 23, 2020, at 10:21 AM, Richard Sandiford <richard.sandif...@arm.com>
> wrote:
>
> Qing Zhao <qing.z...@oracle.com> writes:
>>> On Sep 23, 2020, at 9:32 AM, Richard Sandiford <richard.sandif...@arm.com>
>>> wrote:
>>>
>>> Qing Zhao <qing.z...@oracle.com> writes:
>>>>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford
>>>>> <richard.sandif...@arm.com> wrote:
>>>>>
>>>>> Qing Zhao <qing.z...@oracle.com> writes:
>>>>>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford
>>>>>>> <richard.sandif...@arm.com> wrote:
>>>>>>>>>>
>>>>>>>>>> The following is what I see in i386.md (I haven't yet looked at how
>>>>>>>>>> UNSPEC_VOLATILE is used in GCC's dataflow analysis):
>>>>>>>>>>
>>>>>>>>>> ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
>>>>>>>>>> ;; all of memory.  This blocks insns from being moved across this point.
>>>>>>>>>
>>>>>>>>> Heh, it looks like that comment dates back to 1994. :-)
>>>>>>>>>
>>>>>>>>> The comment is no longer correct though. I wasn't around at the time,
>>>>>>>>> but I assume the comment was only locally true even then.
>>>>>>>>>
>>>>>>>>> If what the comment said was true, then something like:
>>>>>>>>>
>>>>>>>>> (define_insn "cld"
>>>>>>>>>   [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
>>>>>>>>>   ""
>>>>>>>>>   "cld"
>>>>>>>>>   [(set_attr "length" "1")
>>>>>>>>>    (set_attr "length_immediate" "0")
>>>>>>>>>    (set_attr "modrm" "0")])
>>>>>>>>>
>>>>>>>>> would invalidate the entire register file and so would require all
>>>>>>>>> values to be spilt to the stack around the CLD.
>>>>>>>>
>>>>>>>> Okay, thanks for the info.
>>>>>>>> Then what’s the current definition of UNSPEC_VOLATILE?
>>>>>>>
>>>>>>> I'm not sure it's written down anywhere TBH. rtl.texi just says:
>>>>>>>
>>>>>>> @code{unspec_volatile} is used for volatile operations and operations
>>>>>>> that may trap; @code{unspec} is used for other operations.
>>>>>>>
>>>>>>> which seems like a cyclic definition: volatile expressions are defined
>>>>>>> to be expressions that are volatile.
>>>>>>>
>>>>>>> But IMO the semantics are that unspec_volatile patterns with a given
>>>>>>> set of inputs and outputs act for dataflow purposes like volatile asms
>>>>>>> with the same inputs and outputs. The semantics of asm volatile are
>>>>>>> at least slightly more well-defined (if only by example); see
>>>>>>> extend.texi for details. In particular:
>>>>>>>
>>>>>>> Note that the compiler can move even @code{volatile asm} instructions
>>>>>>> relative to other code, including across jump instructions. For
>>>>>>> example, on many targets there is a system register that controls the
>>>>>>> rounding mode of floating-point operations. Setting it with a
>>>>>>> @code{volatile asm} statement, as in the following PowerPC example,
>>>>>>> does not work reliably.
>>>>>>>
>>>>>>> @example
>>>>>>> asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>>>>>>> sum = x + y;
>>>>>>> @end example
>>>>>>>
>>>>>>> The compiler may move the addition back before the @code{volatile asm}
>>>>>>> statement. To make it work as expected, add an artificial dependency to
>>>>>>> the @code{asm} by referencing a variable in the subsequent code, for
>>>>>>> example:
>>>>>>>
>>>>>>> @example
>>>>>>> asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>>>>>>> sum = x + y;
>>>>>>> @end example
>>>>>>>
>>>>>>> which is very similar to the unspec_volatile case we're talking about.
>>>>>>>
>>>>>>> To take an x86 example:
>>>>>>>
>>>>>>> void
>>>>>>> f (char *x)
>>>>>>> {
>>>>>>>   asm volatile ("");
>>>>>>>   x[0] = 0;
>>>>>>>   asm volatile ("");
>>>>>>>   x[1] = 0;
>>>>>>>   asm volatile ("");
>>>>>>> }
>>>>>>
>>>>>> What if we change the above to the following (the asm syntax might
>>>>>> not be correct):
>>>>>>
>>>>>> void
>>>>>> f (char *x)
>>>>>> {
>>>>>>   asm volatile ("x[0]");
>>>>>>   x[0] = 0;
>>>>>>   asm volatile ("x[1]");
>>>>>>   x[1] = 0;
>>>>>>   asm volatile ("");
>>>>>> }
>>>>>>
>>>>>> Will the moving and merging be blocked?
>>>>>
>>>>> That would stop assignments moving up, but it wouldn't stop x[0] moving
>>>>> down across the x[1] asm. Using:
>>>>>
>>>>> asm volatile ("" ::: "memory");
>>>>>
>>>>> would prevent moves in both directions, which was what I meant in my
>>>>> later comment about memory clobbers.
>>>>>
>>>>> In each case, the same would be true for unspec_volatile.
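For reference, a minimal sketch of the memory-clobber variant of the x86
example above, assuming GCC-style extended asm. The __asm__ spelling is
used only so that it also compiles under strict -std= modes; it is
otherwise the same as "asm volatile". Each barrier tells the compiler
that memory may be read or written at that point, so the two byte stores
should not be moved across a barrier or merged into one wider store.

```c
/* Sketch only: the x86 example with a "memory" clobber on every asm.
   The clobber is a compiler-level barrier; it emits no instruction and
   does not constrain the CPU.  */
static void
f (char *x)
{
  __asm__ volatile ("" ::: "memory");
  x[0] = 0;
  __asm__ volatile ("" ::: "memory");
  x[1] = 0;
  __asm__ volatile ("" ::: "memory");
}
```

Comparing the optimized assembly with and without the clobbers is one
way to see the difference.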
>>>>
>>>> So, is the following good enough:
>>>>
>>>> asm volatile (reg1, reg2, … regN, memory)
>>>> mov reg1, 0
>>>> mov reg2, 0
>>>> ...
>>>> mov regN,0
>>>> asm volatile (reg1, reg2,… regN, memory)
>>>> return
>>>>
>>>>
>>>> I.e., just add one “asm volatile” insn whose operands include all registers
>>>> and memory BEFORE and AFTER the whole zeroing sequence.
>>>
>>> It isn't clear from your syntax whether the asm volatile arguments
>>> are uses or clobbers.
>>
>> How can the syntax of asm volatile distinguish “Uses” and “Clobbers”?
>
> Well, I wasn't trying to discuss correct syntax, I just wasn't sure what
> you meant.
>
> As mentioned in the quote below, I was expecting the asm volatile
> before the zeroing to include clobbers generated as discussed in
> the earlier message:
>
> rtx asm_op = gen_rtx_ASM_OPERANDS (…);
> MEM_VOLATILE_P (asm_op) = 1;
>
> rtvec v = rtvec_alloc (N + 1);
> RTVEC_ELT (v, 0) = asm_op;
> RTVEC_ELT (v, 1) = gen_rtx_CLOBBER (VOIDmode, …);
> …
> RTVEC_ELT (v, N) = gen_rtx_CLOBBER (VOIDmode, …);
>
> emit_insn (gen_rtx_PARALLEL (VOIDmode, v));
>
> But doing this after the zeroing would give:
>
> …clobber reg1 in an asm…
> …set reg1 to zero…
> …clobber reg1 in an asm…
>
> Dataflow-wise, the second clobber overwrites the effect of the zeroing.
> Since nothing uses reg1 between the zeroing and the clobber, the zeroing
> could be removed as dead.
Okay, I see.
Thanks for the explanation.
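
The dataflow argument above can be modelled with a toy backward liveness
walk over a single register. This is only a sketch to make the reasoning
concrete: the enum, the function name def_is_dead and the live_out flag
are hypothetical and are not GCC's DF or DCE code. The live_out flag
stands in for the register being live at function exit, which is what
EPILOGUE_USES would arrange.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one register's dataflow; not GCC's actual implementation.  */
enum op { OP_CLOBBER, OP_SET, OP_USE };

/* Walk the sequence backwards, as a liveness analysis would, and report
   whether the definition at index TARGET is dead.  LIVE_OUT says whether
   the register is live at the end of the sequence.  */
static bool
def_is_dead (const enum op *seq, size_t n, size_t target, bool live_out)
{
  bool live = live_out;
  for (size_t i = n; i-- > 0;)
    {
      if (i == target)
        return !live;      /* Dead if nothing later reads the value.  */
      if (seq[i] == OP_USE)
        live = true;       /* A later use keeps the value live.  */
      else
        live = false;      /* A later set or clobber kills the value.  */
    }
  return false;            /* TARGET was out of range.  */
}
```

In this model, the sequence clobber, set, clobber reports the set as dead
even when the register is live at exit, because the trailing clobber
kills it first. The sequence clobber, set with the register live at exit
keeps the set.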
>
>>> The idea was:
>>>
>>> - There would be an asm volatile before the moves that clobbers (but does
>>> not use) (mem:BLK (scratch)) and the zeroed registers.
>>>
>>> - EPILOGUE_USES would make the zeroed registers live after the return.
>>
>> Is EPILOGUE_USES the only way to achieve this? Would adding another “asm
>> volatile” immediately before the return serve the same purpose?
>
> Why do you want to use an asm to keep the instructions live though?
I just wanted to avoid changing “EPILOGUE_USES” and keep the implementation
simpler… :-)
But I might be wrong here.
>
> As I think I mentioned before (but sorry if I'm misremembering),
> using an asm would be counterproductive on delayed-branch targets.
> The delayed branch scheduler looks backwards for something that could
> fill the delay slot. If we have an asm after the zeroing instructions
> that uses the zeroed registers, that would prevent any zeroing
> instruction from filling the delay slot. The delayed branch scheduler
> would therefore try to fill the delay slot with something from before
> the zeroing sequence, which is exactly what we'd like to avoid.
>
> Also, using an asm after the sequence would allow a machine_reorg
> pass to reuse the zeroed registers for something else between the
> second asm and the return.
>
> IMO, marking the zeroed registers as being live out of the function
> is the simplest, most direct way of representing the fact that the
> zeroing effect has to survive to the function return. It's how we
> make sure that the function return value remains live and how we make
> sure that the restored call-preserved registers remain live.
Okay, now I understand.
Thanks a lot for your patience.
Qing
>
> Thanks,
> Richard