On Mon, 29 Jan 2024, Jakub Jelinek wrote:

> On Mon, Jan 29, 2024 at 11:24:58AM +0100, Richard Biener wrote:
> > The following expands .VEC_SET and .VEC_EXTRACT instruction selection
> > to global hard registers, not only automatic variables (possibly)
> > promoted to registers.  This can avoid some ICEs later and create
> > better code.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> >     PR middle-end/113622
> >     * gimple-isel.cc (gimple_expand_vec_set_extract_expr):
> >     Also allow DECL_HARD_REGISTER variables.
> >
> >     * gcc.target/i386/pr113622-1.c: New testcase.
> > ---
> >  gcc/gimple-isel.cc                         |  3 ++-
> >  gcc/testsuite/gcc.target/i386/pr113622-1.c | 12 ++++++++++++
> >  2 files changed, 14 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113622-1.c
> >
> > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> > index 7e2392ecd38..e94f292dd38 100644
> > --- a/gcc/gimple-isel.cc
> > +++ b/gcc/gimple-isel.cc
> > @@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
> >        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> >        machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
> >
> > -      if (auto_var_in_fn_p (view_op0, fun->decl)
> > +      if ((auto_var_in_fn_p (view_op0, fun->decl)
> > +           || DECL_HARD_REGISTER (view_op0))
> >            && !TREE_ADDRESSABLE (view_op0)
> >            && ((!is_extract && can_vec_set_var_idx_p (outermode))
> >                || (is_extract
>
> All we know here from the earlier checks is DECL_P (view_op0), but
> DECL_HARD_REGISTER uses VAR_DECL_CHECK, shouldn't this be
>            || (VAR_P (view_op0) && DECL_HARD_REGISTER (view_op0)))
> instead?

Ah, yeah - will fix.

> > diff --git a/gcc/testsuite/gcc.target/i386/pr113622-1.c b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > new file mode 100644
> > index 00000000000..2d6cb3c89a8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512f -w" } */
> > +
> > +typedef float __attribute__ ((vector_size (64))) vec;
> > +register vec a asm("zmm2"), b asm("zmm0"), c asm("zmm1");
>
> I'd feel better if this used say zmm5, zmm6, zmm7 or something similar
> so that it doesn't clash with some of the implicitly used SSE
> registers, but on the other side still fit into 8 SSE registers
> which ia32 has access to.

OK, will adjust.

Thanks,
Richard.

> > +
> > +void
> > +test (void)
> > +{
> > +  for (int i = 0; i < 8; i++)
> > +    c[i] = a[i] < b[i] ? 0.1 : 0.2;
> > +}
>
> Otherwise LGTM.
>
>     Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
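
As a rough illustration of the review point about DECL_HARD_REGISTER and VAR_DECL_CHECK, here is a small self-contained C model of a checked accessor and the VAR_P-style guard suggested in the thread. Everything prefixed `toy_` or `TOY_` is hypothetical and invented for this sketch; it is not GCC's tree representation or its real macros. The assumed behaviour is only that the accessor is valid for variable declarations alone, so applying it to some other kind of declaration is a checking failure, which the toy counts rather than reporting as an ICE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for declaration kinds; not GCC's tree codes.  */
enum toy_code { TOY_VAR_DECL, TOY_PARM_DECL };

struct toy_decl
{
  enum toy_code code;
  bool hard_register;   /* only meaningful when code == TOY_VAR_DECL */
};

/* Number of times a checked accessor was applied to the wrong kind of
   node.  The toy counts these so the effect can be observed.  */
static int toy_check_failures;

/* Toy analogue of a checked accessor: validate the node kind before
   handing the node back for field access.  */
static const struct toy_decl *
toy_var_decl_check (const struct toy_decl *t)
{
  if (t->code != TOY_VAR_DECL)
    toy_check_failures++;
  return t;
}

#define TOY_VAR_P(T) ((T)->code == TOY_VAR_DECL)
#define TOY_DECL_HARD_REGISTER(T) (toy_var_decl_check (T)->hard_register)

/* Shape of the condition as first posted: accessor used on any decl.  */
static bool
toy_unguarded (const struct toy_decl *t)
{
  return TOY_DECL_HARD_REGISTER (t);
}

/* Shape suggested in review: test the kind first, so the accessor only
   ever sees variable declarations.  */
static bool
toy_guarded (const struct toy_decl *t)
{
  return TOY_VAR_P (t) && TOY_DECL_HARD_REGISTER (t);
}

/* Exercise both forms; returns the number of check failures seen.  */
static int
toy_selftest (void)
{
  struct toy_decl parm = { TOY_PARM_DECL, false };
  struct toy_decl reg = { TOY_VAR_DECL, true };

  toy_check_failures = 0;
  assert (!toy_guarded (&parm));   /* short-circuits before the check */
  assert (toy_guarded (&reg));
  assert (toy_check_failures == 0);

  (void) toy_unguarded (&parm);    /* trips the kind check */
  return toy_check_failures;
}
```

In this model, the guarded form never reaches the accessor for a non-variable declaration because `&&` short-circuits, whereas the unguarded form trips the check once for the parameter node.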