Andrew Pinski <quic_apin...@quicinc.com> writes: > __builtin_aarch64_im_lane_boundsi is known not to throw or call back into > another > function since it will either folded into an NOP or will produce a compiler > error. > > This fixes the ICE by fixing the missed optimization. It does not fix the > underlying > issue with fold_marked_statements; which I filed as PR 117668. > > Built and tested for aarch64-linux-gnu. > > PR target/117665 > > gcc/ChangeLog: > > * config/aarch64/aarch64-builtins.cc > (aarch64_init_simd_builtin_functions): > Pass nothrow and leaf as attributes to aarch64_general_add_builtin for > __builtin_aarch64_im_lane_boundsi. > > gcc/testsuite/ChangeLog: > > * g++.target/aarch64/lane-bound-1.C: New test. > * gcc.target/aarch64/lane-bound-3.c: New test. > > Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
Yeah, ok. I'm a bit nervous about making this builtin easier to optimise, but the real fix for that of course is to do the checking in the frontend (as for the new pragma-based approach). The C++ test is convincing that (a) we do still emit the error for obvious dead code and (b) not marking the function has a user-visible effect. Thanks, Richard > --- > gcc/config/aarch64/aarch64-builtins.cc | 6 ++++- > .../g++.target/aarch64/lane-bound-1.C | 21 +++++++++++++++ > .../gcc.target/aarch64/lane-bound-3.c | 27 +++++++++++++++++++ > 3 files changed, 53 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.target/aarch64/lane-bound-1.C > create mode 100644 gcc/testsuite/gcc.target/aarch64/lane-bound-3.c > > diff --git a/gcc/config/aarch64/aarch64-builtins.cc > b/gcc/config/aarch64/aarch64-builtins.cc > index b860e22f01f..e26ee323a2d 100644 > --- a/gcc/config/aarch64/aarch64-builtins.cc > +++ b/gcc/config/aarch64/aarch64-builtins.cc > @@ -1482,10 +1482,14 @@ aarch64_init_simd_builtin_functions (bool > called_from_pragma) > size_type_node, > intSI_type_node, > NULL); > + /* aarch64_im_lane_boundsi should be leaf and nothrow as it > + is expanded as nop or will cause an user error. */ > + tree attrs = aarch64_add_attribute ("nothrow", NULL_TREE); > + attrs = aarch64_add_attribute ("leaf", attrs); > aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_LANE_CHECK] > = aarch64_general_add_builtin ("__builtin_aarch64_im_lane_boundsi", > lane_check_fpr, > - AARCH64_SIMD_BUILTIN_LANE_CHECK); > + AARCH64_SIMD_BUILTIN_LANE_CHECK, attrs); > } > > for (i = 0; i < ARRAY_SIZE (aarch64_simd_builtin_data); i++, fcode++) > diff --git a/gcc/testsuite/g++.target/aarch64/lane-bound-1.C > b/gcc/testsuite/g++.target/aarch64/lane-bound-1.C > new file mode 100644 > index 00000000000..cb3e99816a1 > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/lane-bound-1.C > @@ -0,0 +1,21 @@ > +// { dg-do compile } > +// { dg-options "" } > +#include <arm_neon.h> > + > +// vgetq_lane_u64 should not cause any > +// exceptions to thrown so even at -O0 > +// removeme should have been removed. > +void removeme() > +__attribute__((error("nothrow"))); > +int _setjmp(); > +void hh(uint64x2_t c, int __b) > +{ > + try { > + vgetq_lane_u64(c, __b); > + // { dg-error "must be a constant immediate" "" { target *-*-* } 0 } > + } catch (...) > + { > + removeme(); // { dg-bogus "declared with attribute error" } > + } > +} > + > diff --git a/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c > b/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c > new file mode 100644 > index 00000000000..9e0dad372cb > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +/* PR target/117665 */ > +/* __builtin_aarch64_im_lane_boundsi was causing an abnormal > + edge to the setjmp but then the builtin was folded into a nop > + and that edge was never removed but the edge was not needed in > + the first place. */ > + > +#include <arm_neon.h> > + > +__attribute__((always_inline)) > +static inline > +void h(uint64x2_t c, int __b) { > + /* Use vgetq_lane_u64 to get a > + __builtin_aarch64_im_lane_boundsi */ > + vgetq_lane_u64(c, __b); > + > + __builtin_unreachable(); > +} > + > +int _setjmp(); > +void hh(uint64x2_t c) { > + int __b = 0; > + if (_setjmp()) > + h(c, 0); > +}