Artem-B wrote: Found another issue. We merge four independent byte loads with `align 1` into a single 32-bit load (`ld.u32`), which fails at runtime when the pointer is not 4-byte aligned.
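As a rough illustration of the condition involved, here is a host-side C++ sketch. It is not the backend's actual logic, and the helper names `is_aligned`, `can_merge_to_wide_load` and `load_u32_bytewise` are hypothetical. It assumes that a wide load is only legal when the known alignment is at least the access width, and it shows the byte-wise fallback that stays valid at any alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `addr` is a multiple of `align`.
// Assumes `align` is a power of two.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t align) {
    return (addr & (align - 1)) == 0;
}

// Hypothetical legality check for merging narrow loads into one wide load:
// the alignment we can prove must cover the full access width.
// Assumes both arguments are powers of two.
constexpr bool can_merge_to_wide_load(std::size_t known_align,
                                      std::size_t width_bytes) {
    return known_align >= width_bytes;
}

// Byte-wise little-endian load of 32 bits. Each access is one byte wide,
// so this is valid for any pointer alignment.
inline std::uint32_t load_u32_bytewise(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

With `align 1` the check rejects a 4-byte merge. Even if the base happened to be 4-byte aligned, byte offset 9 (the offset used in the reproducer that follows) would not be.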
```
%t0 = type { [17 x i8] }
@shared_storage = linkonce_odr local_unnamed_addr addrspace(3) global %t0 undef, align 1

define <4 x i8> @in_v4i8(<4 x i8> %x, <4 x i8> %y) nounwind {
  %v = load <4 x i8>, ptr getelementptr inbounds (i8, ptr addrspacecast (ptr addrspace(3) @shared_storage to ptr), i64 9), align 1
  ret <4 x i8> %v
}
```

```
mov.u64 %rd1, shared_storage;
cvta.shared.u64 %rd2, %rd1;
ld.u32 %r1, [%rd2+9];
st.param.b32 [func_retval0+0], %r1;
ret;
```

https://github.com/llvm/llvm-project/pull/67866
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits