https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118311
Bug ID: 118311
Summary: Poorly optimized trivial integer serialization due to vectorizer on aarch64
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
CC: marco.rubini08 at gmail dot com, unassigned at gcc dot gnu.org
Target Milestone: ---
Target: aarch64

Codegen is suboptimal for the following code. Note that the output improves drastically after uncommenting the commented line.

```
void f(unsigned char* dst, unsigned long long low, unsigned long long high)
{
    dst[0] = ((low >> 0) & 0xff);
    dst[1] = ((low >> 8) & 0xff);
    dst[2] = ((low >> 16) & 0xff);
    dst[3] = ((low >> 24) & 0xff);
    dst[4] = ((low >> 32) & 0xff);
    dst[5] = ((low >> 40) & 0xff);
    dst[6] = ((low >> 48) & 0xff);
    dst[7] = ((low >> 56) & 0xff);
    // asm volatile ("" ::: "memory");
    dst[8] = ((high >> 0) & 0xff);
    dst[9] = ((high >> 8) & 0xff);
    dst[10] = ((high >> 16) & 0xff);
    dst[11] = ((high >> 24) & 0xff);
    dst[12] = ((high >> 32) & 0xff);
    dst[13] = ((high >> 40) & 0xff);
    dst[14] = ((high >> 48) & 0xff);
    dst[15] = ((high >> 56) & 0xff);
}
```

On aarch64, the vectorizer cost model decides that vectorizing is the better choice, for some reason.
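For reference, the sixteen byte stores in `f` amount to writing `low` and `high` to `dst` in little-endian order, which is presumably the scalar form one would hope the compiler recognizes. A minimal sketch of that equivalence follows; the names `store_le64_bytes`, `to_le64`, `f_reference`, and `check_equivalent` are hypothetical and not part of the report, and the byte-order test relies on the GCC/Clang predefined macros.

```c
#include <assert.h>
#include <string.h>

/* Byte-by-byte little-endian store of one 64-bit value, same pattern as f(). */
static void store_le64_bytes(unsigned char *dst, unsigned long long v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (unsigned char)((v >> (8 * i)) & 0xff);
}

/* Hypothetical helper: put the value into little-endian byte order. */
static unsigned long long to_le64(unsigned long long v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap64(v);
#else
    return v; /* assumption: little-endian target, bytes already in order */
#endif
}

/* Hypothetical reference: the same 16 bytes written as two 8-byte copies. */
void f_reference(unsigned char *dst, unsigned long long low,
                 unsigned long long high)
{
    low = to_le64(low);
    high = to_le64(high);
    memcpy(dst, &low, 8);
    memcpy(dst + 8, &high, 8);
}

/* Hypothetical self-check: both forms must produce identical bytes. */
void check_equivalent(unsigned long long low, unsigned long long high)
{
    unsigned char a[16], b[16];
    store_le64_bytes(a, low);
    store_le64_bytes(a + 8, high);
    f_reference(b, low, high);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Comparing the assembly GCC emits for `f` and for `f_reference` at the same optimization level may help show how far the vectorized output is from the plain scalar stores.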