Hi,

On Saturday, 22 June 2024, at 18:58:03 EEST, u...@foxmail.com wrote:
> From: sunyuechi <sunyue...@iscas.ac.cn>

In my opinion, we can't keep on like this. By the end of the year, there will also be 512-bit vector hardware. In the worst case, specialising on vector length could require seven variants of every function, one for each legal LMUL value (1/8, 1/4, 1/2, 1, 2, 4 and 8). Generating the LMUL at run time or at initialisation time is too slow for fixed-size functions, so I can only see two viable options here:

1) We ignore this problem entirely and only optimise for 128-bit vectors, or for the current minimum VLEN. The intent of the specification is ostensibly that processing should scale with the current value of VL, not with VTYPE.LMUL. That is why the minimum legal LMUL value is SEW/ELEN rather than 1/VLMAX (and draft versions did not even have fractional multipliers).

2) We factor the specialisation code heavily, including on the C initialisation side.

Personally, I prefer to ignore the problem until we see more mature and varied hardware. I do note that SiFive is ostensibly not specialising their code by VLEN, which tends to confirm that this is just a case of immature design from T-Head.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".