Hi,

On Saturday, 22 June 2024, 18:58:03 EEST, u...@foxmail.com wrote:
> From: sunyuechi <sunyue...@iscas.ac.cn>

In my opinion, we can't keep on like this. By the end of the year, there will 
also be 512-bit vector hardware. In the worst case, specialisation on vector 
length could require 7 variants of every function, as many as there are legal 
LMUL values.
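
To make the count concrete, here is a rough C sketch (not FFmpeg code; the name 
min_lmul_log2 is made up for illustration) of how the smallest usable LMUL for 
a fixed-size function depends on VLEN. It assumes VLMAX = LMUL * VLEN / SEW and 
that a fractional LMUL is only legal when SEW <= LMUL * ELEN:

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns log2(LMUL) of the smallest legal LMUL such that VLMAX >= n,
 * or 4 if even LMUL = 8 is too small.
 */
static int min_lmul_log2(unsigned vlen, unsigned sew, unsigned elen, unsigned n)
{
    assert(vlen >= sew && sew >= 8 && elen >= sew);

    /* Walk the 7 legal LMUL values: 1/8, 1/4, 1/2, 1, 2, 4, 8. */
    for (int l = -3; l <= 3; l++) {
        unsigned vlmax;

        if (l < 0) {
            /* Fractional LMUL is only legal if SEW <= LMUL * ELEN. */
            if (sew > (elen >> -l))
                continue;
            vlmax = (vlen >> -l) / sew;
        } else {
            vlmax = (vlen << l) / sew;
        }
        if (vlmax >= n)
            return l;
    }
    return 4;
}
```

For a function working on 16 bytes (SEW = 8, ELEN = 64), this gives LMUL = 1 at 
VLEN = 128, 1/2 at 256, 1/4 at 512 and 1/8 at 1024, so each vector length would 
want its own variant.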

Generating the LMUL at run time or initialisation time is too slow for fixed-
size functions, so I can only see two viable options here:

1) We ignore this problem entirely and only optimise to 128-bit or to the 
current minimum VLEN. The intent of the specification is ostensibly that 
processing should scale according to the current value of VL, not VTYPE.LMUL. 
That is why the minimum legal LMUL value is SEW/ELEN rather than 1/VLMAX (and 
draft versions did not even have fractional multipliers).

2) We heavily factor the specialisation code, including on the C 
initialisation side.
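
As a rough idea of what the second option could look like on the C side, here 
is a minimal sketch. It is not existing FFmpeg code: vlen_index, select_variant, 
hyp_fn and NB_VLEN_VARIANTS are hypothetical names, and the vector length in 
bytes (vlenb) is assumed to be queried once elsewhere at initialisation time:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: number of VLEN-specific variants kept per function
 * (here 128-, 256- and 512-bit). */
#define NB_VLEN_VARIANTS 3

/* Hypothetical placeholder for a DSP function pointer type. */
typedef void (*hyp_fn)(void);

/*
 * Map the vector length in bytes to a variant index:
 * 16 bytes (VLEN = 128) -> 0, 32 bytes -> 1, 64 bytes -> 2, ...
 * Larger vectors are clamped to the last variant.
 * Returns -1 if the vectors are shorter than 128 bits.
 */
static int vlen_index(unsigned vlenb, unsigned nb_variants)
{
    int idx = 0;

    assert(nb_variants > 0);
    if (vlenb < 16)
        return -1;
    while ((unsigned)idx + 1 < nb_variants && (16u << (idx + 1)) <= vlenb)
        idx++;
    return idx;
}

/* Pick one entry from a per-function table of variants, or NULL if none fits. */
static hyp_fn select_variant(const hyp_fn table[NB_VLEN_VARIANTS],
                             unsigned vlenb)
{
    int idx = vlen_index(vlenb, NB_VLEN_VARIANTS);

    return idx < 0 ? NULL : table[idx];
}
```

With something like this, each init function would only list its variants in a 
table instead of repeating the same VLEN checks for every function pointer.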

Personally, I prefer to ignore the problem until we see more mature and varied 
hardware. I do note that SiFive is ostensibly not specialising their code by 
VLEN, which tends to confirm that this is just a case of immature design from 
T-Head.

-- 
Rémi Denis-Courmont
http://www.remlab.net/



