https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118311

            Bug ID: 118311
           Summary: Poorly optimized trivial integer serialization due to
                    vectorizer on aarch64
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
                CC: marco.rubini08 at gmail dot com, unassigned at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Codegen is suboptimal for the following code.
Note that the output improves drastically once the commented-out
asm volatile line is uncommented.

```
void f(unsigned char* dst, 
    unsigned long long low, 
    unsigned long long high)
{
    dst[0] = ((low >> 0) & 0xff);
    dst[1] = ((low >> 8) & 0xff);
    dst[2] = ((low >> 16) & 0xff);
    dst[3] = ((low >> 24) & 0xff);
    dst[4] = ((low >> 32) & 0xff);
    dst[5] = ((low >> 40) & 0xff);
    dst[6] = ((low >> 48) & 0xff);
    dst[7] = ((low >> 56) & 0xff);

    // asm volatile ("" ::: "memory");

    dst[8] = ((high >> 0) & 0xff);
    dst[9] = ((high >> 8) & 0xff);
    dst[10] = ((high >> 16) & 0xff);
    dst[11] = ((high >> 24) & 0xff);
    dst[12] = ((high >> 32) & 0xff);
    dst[13] = ((high >> 40) & 0xff);
    dst[14] = ((high >> 48) & 0xff);
    dst[15] = ((high >> 56) & 0xff);
}
```

For aarch64, the vectorizer cost model considers vectorizing these stores
profitable, for reasons that are not yet clear.
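
For comparison, here is a minimal sketch of the store pattern one would hope
to get. It is not part of the original testcase: the name f_memcpy is
hypothetical, and the equivalence with f only holds on a little-endian target
(the usual aarch64 configuration) where unsigned long long is 8 bytes wide.
Under those assumptions the sixteen byte stores amount to two plain 64-bit
stores, which a compiler can typically emit as a pair of scalar stores without
involving vector registers.

```c
#include <string.h>

/* Hypothetical reference version, assuming a little-endian target and an
   8-byte unsigned long long: copies the two values into dst[0..7] and
   dst[8..15] in their in-memory byte order. */
void f_memcpy(unsigned char *dst,
              unsigned long long low,
              unsigned long long high)
{
    memcpy(dst, &low, sizeof low);
    memcpy(dst + sizeof low, &high, sizeof high);
}
```

Comparing the assembly for f and f_memcpy at the same optimization level is
one way to see how far the vectorized output is from the simple scalar form.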
