Hi Juzhe,

I find the bug description rather confusing. What I can see is that the constant in the literal pool is indeed wrong, but how would DSE or a similar pass play a role there? And why only for the smaller modes?
My suspicion would be that the constant in the literal/constant pool is wrong from start to finish. I just played around with the following hunk:

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 542315f88cd..5223c08924f 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4061,7 +4061,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
           whole element.  Often this is byte_mode and contains more
           than one element.  */
        unsigned int nelts = GET_MODE_NUNITS (mode);
-       unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
+       unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
        unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
        scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();

With this all your examples pass for me. We then pack e.g. 16 VNx2BI elements into an int and not just 8. It would also explain why it works for modes where PRECISION == BITSIZE.

Now it will certainly require a more thorough analysis but maybe it's a start?

Regards
 Robin
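P.S. To make the arithmetic in the hunk concrete, here is a small standalone sketch in plain C. It is not GCC code: elts_per_int is a hypothetical helper, the final division (container bits / element bits) is my assumption about how the number of elements per integer follows from the two values computed in the hunk, and the numbers in the comments are made up for illustration rather than the real VNx2BI figures.

```c
/* Bits per byte; 8 on the targets discussed here (assumption).  */
#define BITS_PER_UNIT 8

/* Hypothetical helper mirroring the hunk: MODE_BITS stands in for either
   GET_MODE_BITSIZE or GET_MODE_PRECISION, NELTS for GET_MODE_NUNITS.
   Returns how many elements fit into the smallest integer container.  */
unsigned int
elts_per_int (unsigned int mode_bits, unsigned int nelts)
{
  if (nelts == 0)
    return 0;

  unsigned int elt_bits = mode_bits / nelts;
  if (elt_bits == 0)
    return 0;  /* Element narrower than one bit: not handled here.  */

  /* Same as MAX (elt_bits, BITS_PER_UNIT) in the hunk.  */
  unsigned int int_bits = elt_bits > BITS_PER_UNIT ? elt_bits : BITS_PER_UNIT;

  /* Assumed packing rule: container bits divided by element bits.  */
  return int_bits / elt_bits;
}

/* Illustrative mode with 8 elements, BITSIZE 16 and PRECISION 8:
     elts_per_int (16, 8)  -> elt_bits 2, 4 elements per byte
     elts_per_int (8, 8)   -> elt_bits 1, 8 elements per byte
   When PRECISION == BITSIZE both calls agree, so nothing changes.  */
```

In this made-up case, switching from BITSIZE to PRECISION doubles the number of elements packed into each integer, which is the kind of difference I would expect to show up only for modes where the two values differ.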