On Thu, 2023-11-23 at 15:31 +0800, chenglulu wrote:
> I modified this code to use define_expand:
> 
>      (define_expand "fix_trunc<mode><vimode>2"
>        [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
>              (fix:<VIMODE> (match_operand:FVEC 1 "register_operand" "f")))]
>        ""
>        {
>          emit_insn (gen_<simd_isa>_<x>vftintrz_<simdifmt_for_f>_<simdfmt> (
>            operands[0], operands[1]));
>          DONE;
>        }
>        [(set_attr "type" "simd_fcvt")
>         (set_attr "mode" "<MODE>")])

For

float x[4];
int y[4];

void test()
{
        for (int i = 0; i < 4; i++)
                y[i] = __builtin_rintf(x[i]);
}

the define_expand version produces

        la.local        $r12,.LANCHOR0
        vld     $vr0,$r12,0
        vfrint.s        $vr0,$vr0
        vftintrz.w.s    $vr0,$vr0
        vst     $vr0,$r12,16
        jr      $r1

But with a define_insn or define_insn_and_split we get:

        la.local        $r12,.LANCHOR0
        vld     $vr0,$r12,0
        vftint.w.s      $vr0,$vr0
        vst     $vr0,$r12,16
        jr      $r1

(Our scalar code also generates sub-optimal frint.s-ftintxx.w.s
sequences.  I guess we should fix the scalar code later as well.)

-- 
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University
