https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

            Bug ID: 101621
           Summary: gcc cannot optimize int8_t vector assign with
                    subscription to shuffle
           Product: gcc
           Version: 11.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yumeyao at gmail dot com
  Target Milestone: ---

https://gcc.godbolt.org/z/91cqenf99

typedef char v16b __attribute__((vector_size(16)));

To summary it up, regarding optimizing v = { v[n] ...} into shuffle, targeting
Intel x86(x86_64):
These is a lack of optimization when there is a zero
There is some regression starting from gcc9.
so this might be 2 issues. But I think a proper fix could resolve both.


* gcc can optimize int8_t vector assign with subscription of the same vector to
shuffle, like this:
v16b gcc_can_shuffle(v16b b) {
    return (v16b) {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4], b[8], b[8],
b[8], b[8], b[12], b[12], b[12], b[12]};
}

* However, if there is a zero, gcc can't handle this. Actually this is
supported on Intel x86, with a negative subscription indicating the 'zero
value'.
Clang can do the optimization starting with clang 5.

* Furthermore, there is a regression:
gcc < 8 can always optimize it, but starting with gcc9, if there is a cast,
then the optimization fails:
typedef long v2si64 __attribute__((vector_size(16)));
v16b gcc_cannot_shuffle_with_cast(v2si64 x) {
    v16b b = (v16b)x;
    v16b b0 = {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4], b[8], b[8],
b[8], b[8], b[12], b[12], b[12], b[12]};
    return b0;
}
gcc 11 can optimize it on -O3, but not on -O1 or -O2.

Reply via email to