Hello, I have received the K230 and installed Debian following your
method. I have added the K230 benchmark results to the patch in
this reply.
k230
vc1dsp.vc1_inv_trans_4x4_dc_c: 125.7
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 53.5
vc1dsp.vc1_inv_trans_4x8_dc_c: 230.7
vc1
> FWIW CanMV-K230 boards are on sale for under 500 RMB.
I just paid for one. (I saw you mention on IRC that you're going to
write about K230+Debian. Looking forward to it.)
On Wed, Dec 6, 2023 at 04:11, Rémi Denis-Courmont wrote:
> On Tuesday, 5 December 2023, 21:25:12 EET, flow gg wrote:
> > > This
I'm sorry for my carelessness. I used to build and run manually, but
I have now switched to a script, so I accidentally missed the error.
I will modify the script to avoid this kind of issue in the future.
libavcodec/riscv/vc1dsp_rvv.S:35: Error: improper CSRxI immediate
Cha
On Tuesday, 5 December 2023, 21:25:12 EET, flow gg wrote:
> > This block can be folded into the next. You don't need to check VLENB
>
> twice.
>
> Changed.
>
> > Instruction scheduling could be better, especially on in-order CPUs.
>
> I put the vload at the front, and then proceeded with
> This block can be folded into the next. You don't need to check VLENB
twice.
Changed.
> Instruction scheduling could be better, especially on in-order CPUs.
I put the vload at the front, and then proceeded with the t2 operation, but
I'm not sure...
> You don't need to reset the AVL here, just
Hi,
> diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
> index 2d0e6c19c8..442c5961ea 100644
> --- a/libavcodec/riscv/Makefile
> +++ b/libavcodec/riscv/Makefile
> @@ -39,5 +39,7 @@ OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
> RVV-OBJS-$(CONFIG_PIXBLOCKDSP)
Okay, after using zext, two vset instructions can be deleted, which is
better than splat. I have updated the patch in this reply.
On Mon, Dec 4, 2023 at 23:15, Rémi Denis-Courmont wrote:
> On Monday, 4 December 2023, 10:48:56 EET, flow gg wrote:
> > > Probably missing VLENB checks.
> >
> > Changed.
> >
> > > You can
I found that in the case of nosplat, an additional vset can be removed, and
the time is basically the same, so I updated the patch.
On Mon, Dec 4, 2023 at 23:15, Rémi Denis-Courmont wrote:
> On Monday, 4 December 2023, 10:48:56 EET, flow gg wrote:
> > > Probably missing VLENB checks.
> >
> > Changed.
On Monday, 4 December 2023, 10:48:56 EET, flow gg wrote:
> > Probably missing VLENB checks.
>
> Changed.
>
> > You can multiply by 3, 5 or 9 with shift-and-add. By 12 with shift-and-add
> > then shift, and by 17 with shift then add. You don't need multiplications.
>
> Changed.
>
> > Do
> Probably missing VLENB checks.
Changed.
> You can multiply by 3, 5 or 9 with shift-and-add. By 12 with shift-and-add
> then shift, and by 17 with shift then add. You don't need multiplications.
Changed.
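
For readers following along, the shift-and-add identities from the review comment can be sketched in scalar C. This is only an illustration of the arithmetic, not the code from the patch: the helper names (shl, mul3, mul5, mul9, mul12, mul17) are hypothetical, and the actual patch works on RVV vector registers rather than scalars.

```c
#include <stdint.h>

/* Hypothetical helper: left shift done on the unsigned representation so
 * that negative inputs do not hit undefined behaviour in C. */
static inline int32_t shl(int32_t x, unsigned n)
{
    return (int32_t)((uint32_t)x << n);
}

/* x * 3 = (x << 1) + x, x * 5 = (x << 2) + x, x * 9 = (x << 3) + x */
static inline int32_t mul3(int32_t x)  { return shl(x, 1) + x; }
static inline int32_t mul5(int32_t x)  { return shl(x, 2) + x; }
static inline int32_t mul9(int32_t x)  { return shl(x, 3) + x; }

/* x * 12 = (x * 3) << 2: shift-and-add, then shift */
static inline int32_t mul12(int32_t x) { return shl(shl(x, 1) + x, 2); }

/* x * 17 = (x << 4) + x: shift, then add */
static inline int32_t mul17(int32_t x) { return shl(x, 4) + x; }
```

Each helper should agree with a plain multiplication for inputs small enough not to overflow 32 bits.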
> Do you really need to splat? Can't .vx or .wx be used instead?
Okay, for example in ff_
On Sunday, 3 December 2023, 16:40:08 EET, flow gg wrote:
> c910
> vc1dsp.vc1_inv_trans_4x4_dc_c: 84.0
> vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 74.0
> vc1dsp.vc1_inv_trans_4x8_dc_c: 150.2
> vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 83.5
> vc1dsp.vc1_inv_trans_8x4_dc_c: 129.0
>
c910
vc1dsp.vc1_inv_trans_4x4_dc_c: 84.0
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 74.0
vc1dsp.vc1_inv_trans_4x8_dc_c: 150.2
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 83.5
vc1dsp.vc1_inv_trans_8x4_dc_c: 129.0
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 75.7
vc1dsp.vc1_inv_trans_8x8_dc_c: