On 2021/6/9 05:07, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jun 08, 2021 at 09:11:33AM +0800, Xionghu Luo wrote:
>> On P8LE, extra rot64+rot64 load or store instructions are generated
>> in float128 to vector __int128 conversion.
>>
>> This patch teaches pass swaps to also handle such pattens to remove
>> extra swap instructions.
>
>> +/* Return 1 iff PAT is a rotate 64 bit expression; else return 0. */
>> +
>> +static bool
>> +pattern_is_rotate64_p (rtx pat)
>
> You already have a verb in the name, don't use _p please (and preferably
> just don't use it at all, "pattern_is_rotate64" is much better than
> "pattern_rotate64_p").
>
>> +{
>> + rtx rot = SET_SRC (pat);
>
> So this is assuming PAT is a SINGLE_SET. Please say that in the
> function comment.
>
> /* Return 1 iff PAT (a SINGLE_SET) is a rotate 64 bit expression; else
> return 0. */
>
> You can do an assert for that as well, but I wouldn't bother.
>
>> @@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)
>
> (I do realise you just copied existing naming, don't worry :-) )
>
>> @@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry,
>> rtx_insn *insn)
>> false. */
>> rtx body = PATTERN (def_insn);
>> if (GET_CODE (body) != SET
>> - || GET_CODE (SET_SRC (body)) != VEC_SELECT
>> + || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
>> + || pattern_is_rotate64_p (body))
>
> Broken indentation: the || should align with "pattern...".
>
>> @@ -2223,9 +2246,9 @@ static void
>> recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
>> {
>> rtx body = PATTERN (insn);
>> - gcc_assert (GET_CODE (body) == SET
>> - && MEM_P (SET_DEST (body))
>> - && GET_CODE (SET_SRC (body)) == VEC_SELECT);
>> + gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
>> + && (GET_CODE (SET_SRC (body)) == VEC_SELECT
>> + || pattern_is_rotate64_p (body)));
>
> Please start a new line for every "&&" here. The way it was was more
> readable.
>
> It often is nice to keep things one one line, if it fits on one line.
> If it does not, make a new line for every phrase. This is more readable
> because you can then just scan down the line of "&&" and see the start
> of every phrase without actually having to read it all.
>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> index 5895416e985..a1f09df8a57 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> @@ -21,5 +21,5 @@
>> TYPE one (void) { return ONE; }
>> void store (TYPE a, TYPE *p) { *p = a; }
>>
>> -/* { dg-final { scan-assembler "lxvd2x 34" } } */
>> -/* { dg-final { scan-assembler "stxvd2x 34" } } */
>> +/* { dg-final { scan-assembler "lvx 2" } } */
>> +/* { dg-final { scan-assembler "stvx 2" } } */
>
> Huh. Is that correct? Where did the other 32 loads and stores go? Are
> there now other insns generated that you should scan for?
This is expected change. lxvd2x+xxpermdi is replaced by lvx. No need scan other
instructions. Similarly for stvx. 34 and 2 are *vector register names* instead
of
counts.
diff float128-call.trunk.s float128-call.patched.s
18,19c18
< lxvd2x 34,0,9
< xxpermdi 34,34,34,2
---
> lvx 2,0,9
33,34c32
< xxpermdi 34,34,34,2
< stxvd2x 34,0,5
---
> stvx 2,0,5
Thanks for all the other comments, updated and committed with r12-1316.
BR,
Xionghu