On Friday, 11 October 2024, 13:40:20 EEST, u...@foxmail.com wrote:
> From: sunyuechi <sunyue...@iscas.ac.cn>
> +.macro put_uni_pixels w, vlen, id
> +\id\w\vlen:
> +.if \w == 128 && \vlen == 128
> +        li                t0, \w
> +        vsetvli           zero, t0, e8, m8, ta, ma
> +.else
> +        vsetvlstatic8     \w, \vlen
> +.endif
> +1:
> +        vle8.v            v0, (a2)
> +        addi              a4, a4, -1
> +        vse8.v            v0, (a0)
> +        add               a2, a2, a3
> +        add               a0, a0, a1
> +        bnez              a4, 1b
> +        ret

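For rows up to 64 bits wide (8 pixels at 8 bits each), you can use strided loads and stores here, treating each row as a single element so that one instruction moves several rows.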

Though for memory copying, unaligned scalar accesses might be just as fast.
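
As a rough illustration of the scalar alternative, here is a minimal C sketch for the 8-byte-wide case. The helper name copy_rows8 and its signature are hypothetical and not part of FFmpeg; the fixed 8-byte row width is an assumption chosen to match the 64-bit case. Going through memcpy keeps the unaligned access well-defined in C, and compilers commonly lower it to a single load and store on targets where unaligned accesses are cheap, though that is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an FFmpeg function: copy `height` rows of
 * 8 bytes each, with independent source and destination strides. */
static void copy_rows8(uint8_t *dst, ptrdiff_t dst_stride,
                       const uint8_t *src, ptrdiff_t src_stride,
                       int height)
{
    for (int y = 0; y < height; y++) {
        uint64_t row;

        /* memcpy makes the possibly unaligned 64-bit access legal C. */
        memcpy(&row, src, sizeof(row));
        memcpy(dst, &row, sizeof(row));

        src += src_stride;
        dst += dst_stride;
    }
}
```

Whether this beats the vector loop depends on the core, so it would need benchmarking on real hardware.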

-- 
レミ・デニ-クールモン
http://www.remlab.net/



_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel