arm: Convert Neon 64-bit element 3-reg-same insns

Richard Henderson Fri, 01 May 2020 09:17:23 -0700

On 5/1/20 8:54 AM, Peter Maydell wrote:
> On Thu, 30 Apr 2020 at 21:54, Richard Henderson
> <richard.hender...@linaro.org> wrote:
>>
>> On 4/30/20 11:09 AM, Peter Maydell wrote:
>>> +
>>> +    rn = tcg_temp_new_i64();
>>> +    rm = tcg_temp_new_i64();
>>> +    rd = tcg_temp_new_i64();
>>> +
>>> +    for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
>>> +        neon_load_reg64(rn, a->vn + pass);
>>> +        neon_load_reg64(rm, a->vm + pass);
>>> +        fn(rd, rm, rn);
>>> +        neon_store_reg64(rd, a->vd + pass);
>>> +    }
>>> +
>>> +    tcg_temp_free_i64(rn);
>>> +    tcg_temp_free_i64(rm);
>>> +    tcg_temp_free_i64(rd);
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +#define DO_3SAME_64(INSN, FUNC)                                         \
>>> +    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a)        \
>>> +    {                                                                   \
>>> +        return do_3same_64(s, a, FUNC);                                 \
>>> +    }
>>
>> You can morph this into the gvec interface like so:
>>
>> #define DO_3SAME_64(INSN, FUNC)                                         \
>>     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
>>                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
>>                                 uint32_t oprsz, uint32_t maxsz)         \
>>     {                                                                   \
>>         static const GVecGen3 op = { .fni8 = FUNC };                    \
>>         tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
>>                        oprsz, maxsz, &op);                              \
>>     }                                                                   \
>>     DO_3SAME(INSN, gen_##INSN##_3s)
>>
>> The .fni8 function tells gvec that we have a helper that processes the
>> operation in 8 byte chunks.  It will handle the pass loop for you.
> 
> This doesn't quite work, because these are shift ops and
> so the operands are passed to the helper in the order
> rd, rm, rn. Reshuffling the order of arguments to
> tcg_gen_gvec_3() fixes this, though.
> 
> I guess I should call the macro DO_3SAME_SHIFT64, I hadn't
> noticed it was shift specific because the only thing we do
> with it is shifts.


See my reply to patch 26.  I think we should swap these operands during decode.
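
To illustrate the idea of swapping during decode, here is a small standalone
model in plain C. It is only a sketch and not QEMU code: toy_arg_3same,
toy_decode_shift, toy_gvec_3, toy_do_3same and toy_shl are hypothetical
names, registers are modelled as 8-byte slots in a byte array, and the shift
helper is a simplified left shift rather than the real Neon semantics. The
point it shows is that once the decoder swaps the Vn and Vm fields, the
generic (rd, rn, rm) expansion path can be used unchanged for a helper that
wants (rd, rm, rn).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a per-64-bit-lane callback: returns f(a, b). */
typedef uint64_t lane_fn(uint64_t a, uint64_t b);

/* Hypothetical decoded-argument struct (register numbers of 8-byte regs). */
typedef struct {
    int vd, vn, vm;
} toy_arg_3same;

/* Toy 3-operand expander: walks oprsz bytes in 8-byte chunks. */
static void toy_gvec_3(uint8_t *env, uint32_t d_ofs, uint32_t a_ofs,
                       uint32_t b_ofs, uint32_t oprsz, lane_fn *fn)
{
    assert(oprsz % 8 == 0);
    for (uint32_t i = 0; i < oprsz; i += 8) {
        uint64_t a, b, d;
        memcpy(&a, env + a_ofs + i, 8);
        memcpy(&b, env + b_ofs + i, 8);
        d = fn(a, b);
        memcpy(env + d_ofs + i, &d, 8);
    }
}

/* Hypothetical decode step for shift insns: swap Vn and Vm up front. */
static toy_arg_3same toy_decode_shift(int vd, int vn, int vm)
{
    toy_arg_3same a = { .vd = vd, .vn = vm, .vm = vn };
    return a;
}

/* Generic path: always passes operands in (rd, rn, rm) order. */
static void toy_do_3same(uint8_t *env, const toy_arg_3same *a,
                         uint32_t oprsz, lane_fn *fn)
{
    toy_gvec_3(env, a->vd * 8, a->vn * 8, a->vm * 8, oprsz, fn);
}

/* Simplified shift helper: first operand is the value, second the count. */
static uint64_t toy_shl(uint64_t value, uint64_t shift)
{
    return value << (shift & 63);
}
```

With the instruction fields vd=0, vn=2 (shift counts) and vm=4 (values),
toy_decode_shift() hands the generic path a struct whose first source is the
value register, so toy_shl sees its arguments in the order it expects without
any shift-specific wrapper.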


r~

Re: [PATCH 23/36] target/arm: Convert Neon 64-bit element 3-reg-same insns
