[Bug target/95958] [meta-bug] Inefficient arm_neon.h code for AArch64

rsandifo at gcc dot gnu.org Mon, 29 Jun 2020 04:11:42 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95958


rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
         Depends on|                            |66675, 80283, 88212, 89057,
                   |                            |89606, 89967, 91753, 94442,
                   |                            |95265, 91598, 82074

--- Comment #1 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
At the time of writing, PR80283 doesn't have an AArch64 testcase,
only an AArch32 one.  However, the underlying issue applies across
targets.  The content probably overlaps a lot with PR91598, which
is AArch64-specific.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66675
[Bug 66675] Could improve vector lane folding style operations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
[Bug 80283] [8/9/10/11 Regression] bad SIMD register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82074
[Bug 82074] [aarch64] vmlsq_f32 compiled into 2 instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88212
[Bug 88212] IRA Register Coalescing not working for the testcase
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057
[Bug 89057] [8/9/10/11 Regression] AArch64 ld3 st4 less optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89606
[Bug 89606] Extra mov after structure load instructions on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89967
[Bug 89967] Inefficient code generation for vld2q_lane_u8 under aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598
[Bug 91598] [8/9 regression] 60% speed drop on neon intrinsic loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91753
[Bug 91753] Bad register allocation of multi-register types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94442
[Bug 94442] [10/11 regression] Redundant loads/stores emitted at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95265
[Bug 95265] aarch64: suboptimal code generation for common neon intrinsic sequence involving shrn and mull
