Hi,

On 24 February 2024 03:07:36 GMT+02:00, flow gg <hlefthl...@gmail.com> wrote:
> .ifc \len,4
>-        vsetivli        zero, 5, e8, mf2, ta, ma
>+        vsetivli        zero, 5, e8, m1, ta, ma
> .elseif \len == 8
>         vsetivli        zero, 9, e8, m1, ta, ma
> .else
>@@ -112,9 +112,9 @@ endfunc
>         vslide1down.vx  v2, \dst, t5
>
> .ifc \len,4
>-        vsetivli        zero, 4, e8, mf4, ta, ma
>+        vsetivli        zero, 4, e8, m1, ta, ma
> .elseif \len == 8
>-        vsetivli        zero, 8, e8, mf2, ta, ma
>+        vsetivli        zero, 8, e8, m1, ta, ma
>
>What are the benefits of not using fractional multipliers here?

Insofar as E8/MF4 is guaranteed to work for Zve32x, there are no benefits per 
se.

However, fractional multipliers were added to the specification to enable 
addressing individual vectors whilst the effective multiplier is larger than 
one. This can only happen with mixed widths. Fractions were not intended to 
make vectors shorter - there is the vector length for that already.
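
To put numbers on the mixed-width case: in the RVV 1.0 specification, an 
operand whose effective element width (EEW) differs from SEW gets an effective 
multiplier EMUL = (EEW / SEW) * LMUL. The following is only a sketch of that 
arithmetic; the helper name `emul` is made up for illustration and is not part 
of any toolchain.

```python
from fractions import Fraction

def emul(eew: int, sew: int, lmul: Fraction) -> Fraction:
    """Effective multiplier of an operand of width EEW under vtype SEW/LMUL."""
    return Fraction(eew, sew) * lmul

# Widening operation (such as vwmulu.vx) under e8/mf2: the 16-bit
# destination fills exactly one register, the 8-bit source half of one.
print(emul(16, 8, Fraction(1, 2)))  # 1
print(emul(8, 8, Fraction(1, 2)))   # 1/2

# The same operation under e8/m1: the destination is a 2-register group,
# so only even-numbered registers can be named as the destination.
print(emul(16, 8, Fraction(1)))     # 2
```

With the fractional multiplier on the narrow type, the wide type stays at a 
multiplier of one and every vector register remains individually addressable.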

That's why "E64/MF2" doesn't work, even though it's the same vector bit size as 
"E8/MF2".
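
As a rough sketch of the rule behind this, going by my reading of the RVV 1.0 
specification: for a fractional LMUL, an implementation only has to support 
SEW values up to LMUL * ELEN, where ELEN is the widest supported element width 
(32 for Zve32x, 64 for Zve64x). The function name `guaranteed` and its 
parameters are hypothetical, and the check covers only what the specification 
guarantees, not what a particular core may additionally accept.

```python
from fractions import Fraction

def guaranteed(sew: int, lmul: Fraction, elen: int, sew_min: int = 8) -> bool:
    """True if the SEW/LMUL pair must work on any implementation with this ELEN."""
    if sew < sew_min or sew > elen:
        return False
    if lmul >= 1:
        return True
    # Fractional LMUL: only SEW values up to LMUL * ELEN are required.
    return sew <= lmul * elen

print(guaranteed(8, Fraction(1, 4), elen=32))   # True:  E8/MF4 on Zve32x
print(guaranteed(8, Fraction(1, 2), elen=64))   # True:  E8/MF2
print(guaranteed(64, Fraction(1, 2), elen=64))  # False: E64/MF2
```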

> Making this change would result in a 10%-20% slowdown.

That's kind of odd. This may be caused by the slides, but it's strange for 
hardware to go out of its way to optimise a case that's not even intended.

>                            mf2/4   m1
>vp8_put_bilin4_h_rvv_i32:   158.7   193.7
>vp8_put_bilin4_hv_rvv_i32:  255.7   302.7
>vp8_put_bilin8_h_rvv_i32:   318.7   358.7
>vp8_put_bilin8_hv_rvv_i32:  528.7   569.7
>
>On Sat, 24 Feb 2024 at 01:18, Rémi Denis-Courmont <r...@remlab.net> wrote:
>
>> Hi,
>>
>> +
>> +.macro bilin_h_load dst len
>> +.ifc \len,4
>> +        vsetivli        zero, 5, e8, mf2, ta, ma
>>
>> Don't use fractional multipliers if you don't mix element widths.
>>
>> +.elseif \len == 8
>> +        vsetivli        zero, 9, e8, m1, ta, ma
>> +.else
>> +        vsetivli        zero, 17, e8, m2, ta, ma
>> +.endif
>> +
>> +        vle8.v          \dst, (a2)
>> +        vslide1down.vx  v2, \dst, t5
>> +
>>
>> +.ifc \len,4
>> +        vsetivli        zero, 4, e8, mf4, ta, ma
>>
>> Same as above.
>>
>> +.elseif \len == 8
>> +        vsetivli        zero, 8, e8, mf2, ta, ma
>>
>> Also.
>>
>> +.else
>> +        vsetivli        zero, 16, e8, m1, ta, ma
>> +.endif
>>
>> +        vwmulu.vx       v28, \dst, t1
>> +        vwmaccu.vx      v28, a5, v2
>> +        vwaddu.wx       v24, v28, t4
>> +        vnsra.wi        \dst, v24, 3
>> +.endm
>> +
>> +.macro put_vp8_bilin_h len
>> +        li              t1, 8
>> +        li              t4, 4
>> +        li              t5, 1
>> +        sub             t1, t1, a5
>> +1:
>> +        addi            a4, a4, -1
>> +        bilin_h_load    v0, \len
>> +        vse8.v          v0, (a0)
>> +        add             a2, a2, a3
>> +        add             a0, a0, a1
>> +        bnez            a4, 1b
>> +
>> +        ret
>> +.endm
>> +
>> +func ff_put_vp8_bilin16_h_rvv, zve32x
>> +        put_vp8_bilin_h 16
>> +endfunc
>> +
>> +func ff_put_vp8_bilin8_h_rvv, zve32x
>> +        put_vp8_bilin_h 8
>> +endfunc
>> +
>> +func ff_put_vp8_bilin4_h_rvv, zve32x
>> +        put_vp8_bilin_h 4
>> +endfunc
>>
>> --
>> レミ・デニ-クールモン
>> http://www.remlab.net/
>>
>>
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>>
