On Fri, Feb 5, 2021 at 9:59 AM Kyrylo Tkachov via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi all, > > We can do better on the vget_low* intrinsics. > Currently they reinterpret their argument into a V2DI vector and extract the > low "lane", > reinterpreting that back into the shorter vector. > This is functionally correct and generates a sequence of subregs and a > vec_select that, by itself, > gets optimised away eventually. > However it's bad when we want to use the result in a other SIMD operations. > Then the subreg-vec_select-subreg combo blocks many combine patterns. > > This patch reimplements them to emit a proper low vec_select from the start. > It generates much cleaner RTL and allows for more aggressive combinations, > particularly > with the patterns that Jonathan has been pushing lately. > > Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. > Pushing to trunk.
Just to remind you folks that we're in stage4 which means fixes to regressions (or wrong-code) only. aarch64 is a primary target and you should provide a good example of following the rules we set up for GCC development. I'd expect _at least_ a short sentence on why you think this change is absolutely required for GCC 11. The change also comes with zero testcases and zero bug references. Sorry for this particular change taking the fire, I just picked a random one of the non-regression change-storm I'm seeing for arm/aarch64 recently. Thanks for your consideration, Richard. > Thanks, > Kyrill > > Thanks, > Kyrill > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd-builtins.def (get_low): Define builtin. > * config/aarch64/aarch64-simd.md (aarch64_get_low<mode>): Define. > * config/aarch64/arm_neon.h (__GET_LOW): Delete. > (vget_low_f16): Reimplement using new builtin. > (vget_low_f32): Likewise. > (vget_low_f64): Likewise. > (vget_low_p8): Likewise. > (vget_low_p16): Likewise. > (vget_low_p64): Likewise. > (vget_low_s8): Likewise. > (vget_low_s16): Likewise. > (vget_low_s32): Likewise. > (vget_low_s64): Likewise. > (vget_low_u8): Likewise. > (vget_low_u16): Likewise. > (vget_low_u32): Likewise. > (vget_low_u64): Likewise.