https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052
--- Comment #12 from Matthias Kretz <kretz at kde dot org> ---
(In reply to Jakub Jelinek from comment #11)
> [...] though for 8x conversions we
> are e.g. on x86 already outside of the realm of natively supported vectors
> (we don't really want MMX and for 1024 bit and wider generic vectors we
> don't always emit best code).

Creatively thinking, consider constants stored as (u)char arrays (for bandwidth
optimization), converted to double or (u)llong when used. I'd want to use a
half-SSE load + subsequent conversion to an AVX-512 vector (e.g. vpmovsxbq +
vcvtqq2pd) or even a full SSE load + one shift and two conversions to AVX-512.

Similar motivation for the reverse direction. (Though a lot less likely to be
used in practice, I believe. Hmm, maybe AI applications can prove that
expectation wrong.)

But we should track optimizations in their own issues.
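
For illustration, the schar -> double case described above could be written
roughly as follows. This is only a sketch under assumptions: it relies on the
GCC/Clang vector extensions and on __builtin_convertvector being available
(GCC 9 or later, or Clang), and the typedef and function names are
hypothetical, not taken from any existing code. Whether the compiler actually
emits vpmovsxbq + vcvtqq2pd for it depends on the compiler version and target
flags.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical typedefs using the GCC/Clang vector extensions. */
typedef signed char schar8 __attribute__((vector_size(8)));   /* 8 x 8 bit: half an SSE register */
typedef double double8 __attribute__((vector_size(64)));      /* 8 x 64 bit: one AVX-512 register */

/* Hypothetical helper: widen 8 packed signed chars to 8 doubles.
 * Vectors are moved through memory so that no 64-byte vector is passed
 * or returned by value (avoids ABI warnings on targets without AVX-512). */
static void widen_schar8_to_double8(const signed char *src, double *dst)
{
    assert(src != NULL && dst != NULL);

    schar8 narrow;
    memcpy(&narrow, src, sizeof narrow);   /* 64-bit load of the stored constants */

    /* Element-wise 8x widening conversion; both vectors have 8 elements. */
    double8 wide = __builtin_convertvector(narrow, double8);

    memcpy(dst, &wide, sizeof wide);       /* store the 8 doubles */
}
```

Called with a pointer to 8 signed chars and a buffer for 8 doubles, each
output element should equal the corresponding input cast to double. On targets
without AVX-512 the same source still compiles, but the conversion is lowered
to narrower or scalar code.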