On Thu, Sep 24, 2020 at 10:21 AM xionghu luo <luo...@linux.ibm.com> wrote:
>
> Hi Segher,
>
> The attached two patches are updated and split from
>  "[PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple 
> [PR79251]"
> as your comments.
>
>
> [PATCH v3 2/3] rs6000: Fix lvsl&lvsr mode and change rs6000_expand_vector_set 
> param
>
> This one is preparation work of fix lvsl&lvsr arg mode and 
> rs6000_expand_vector_set
> parameter support for both constant and variable index input.
>
>
> [PATCH v3 2/3] rs6000: Support variable insert and Expand vec_insert in 
> expander [PR79251]
>
> This one is Building VIEW_CONVERT_EXPR and expand the IFN VEC_SET to fast.

I'll just comment that

        xxperm 34,34,33
        xxinsertw 34,0,12
        xxperm 34,34,32

doesn't look like a variable-position insert instruction; rather,
it is a variable whole-vector rotate, then an insert at index zero,
followed by another variable whole-vector rotate.  I'm not fluent in
ppc assembly but

        rlwinm 6,6,2,28,29
        mtvsrwz 0,5
        lvsr 1,0,6
        lvsl 0,0,6

possibly computes the shift masks for r33/r32?  Though
I do not see those registers mentioned...

This might be a generic viable expansion strategy btw,
which is why I asked before whether the CPU supports
inserts at a variable position ...  the building blocks are
already there with vec_set at constant zero position
plus vec_perm_const for the rotates.
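
To make the rotate / insert-at-zero / rotate-back idea concrete, here is
a rough scalar model in plain C.  It is only a sketch of the data flow,
not what the patch or the expander emits: NELTS, rotate_left and
insert_variable are names made up for illustration, the vector is
modelled as a four-element array, and the index is reduced modulo the
element count.

```c
#include <stdint.h>
#include <string.h>

#define NELTS 4  /* assumption: a vector of four 32-bit elements */

/* Whole-vector rotate: afterwards v[i] holds the old v[(i + n) % NELTS]. */
static void rotate_left(uint32_t v[NELTS], unsigned n)
{
    uint32_t tmp[NELTS];
    for (unsigned i = 0; i < NELTS; i++)
        tmp[i] = v[(i + n) % NELTS];
    memcpy(v, tmp, sizeof tmp);
}

/* Variable-index insert built from two variable rotates and one
   insert at the constant position zero.  */
static void insert_variable(uint32_t v[NELTS], uint32_t val, unsigned idx)
{
    idx %= NELTS;
    rotate_left(v, idx);                    /* element idx moves to slot 0 */
    v[0] = val;                             /* constant-position insert */
    rotate_left(v, (NELTS - idx) % NELTS);  /* undo the first rotate */
}
```

In this model only the two rotate amounts depend on the index; the
insert itself always targets position zero, which matches the building
blocks listed above.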

But well, I did ask this question.  Multiple times.

ppc does _not_ have a VSX instruction
like xxinsertw r34, r8, r12 where r8 denotes
the vector element (or byte position or whatever).

So I don't think vec_set with a variable index is the
best approach.
Xionghu - you said even without the patch the stack
storage is eventually elided but

        addi 9,1,-16
        rldic 6,6,2,60
        stxv 34,-16(1)
        stwx 5,9,6
        lxv 34,-16(1)

still shows stack(?) store/load with a bad STLF (store-to-load
forwarding) penalty.

Richard.

>
> Thanks,
> Xionghu
