On Mon, 29 Jan 2024, Jakub Jelinek wrote:

> On Mon, Jan 29, 2024 at 11:24:58AM +0100, Richard Biener wrote:
> > The following expands .VEC_SET and .VEC_EXTRACT instruction selection
> > to global hard registers, not only automatic variables (possibly)
> > promoted to registers.  This can avoid some ICEs later and create
> > better code.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > OK?
> > 
> > Thanks,
> > Richard.
> > 
> >     PR middle-end/113622
> >     * gimple-isel.cc (gimple_expand_vec_set_extract_expr):
> >     Also allow DECL_HARD_REGISTER variables.
> > 
> >     * gcc.target/i386/pr113622-1.c: New testcase.
> > ---
> >  gcc/gimple-isel.cc                         |  3 ++-
> >  gcc/testsuite/gcc.target/i386/pr113622-1.c | 12 ++++++++++++
> >  2 files changed, 14 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113622-1.c
> > 
> > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> > index 7e2392ecd38..e94f292dd38 100644
> > --- a/gcc/gimple-isel.cc
> > +++ b/gcc/gimple-isel.cc
> > @@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function 
> > *fun,
> >        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> >        machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
> >  
> > -      if (auto_var_in_fn_p (view_op0, fun->decl)
> > +      if ((auto_var_in_fn_p (view_op0, fun->decl)
> > +      || DECL_HARD_REGISTER (view_op0))
> >       && !TREE_ADDRESSABLE (view_op0)
> >       && ((!is_extract && can_vec_set_var_idx_p (outermode))
> >           || (is_extract
> 
> All we know here from the earlier checks is DECL_P (view_op0), but
> DECL_HARD_REGISTER uses VAR_DECL_CHECK, shouldn't this be
>          || (VAR_P (view_op0) && DECL_HARD_REGISTER (view_op0)))
> instead?

Ah, yeah - will fix.

> > diff --git a/gcc/testsuite/gcc.target/i386/pr113622-1.c 
> > b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > new file mode 100644
> > index 00000000000..2d6cb3c89a8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr113622-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512f -w" } */
> > +
> > +typedef float __attribute__ ((vector_size (64))) vec;
> > +register vec a asm("zmm2"), b asm("zmm0"), c asm("zmm1");
> 
> I'd feel better if this used say zmm5, zmm6, zmm7 or something similar
> so that it doesn't clash with some of the implicitly used SSE
> registers, but on the other side still fit into 8 SSE registers
> which ia32 has access to.

OK, will adjust.

Thanks,
Richard.

> > +
> > +void
> > +test (void)
> > +{
> > +  for (int i = 0; i < 8; i++)
> > +    c[i] = a[i] < b[i] ? 0.1 : 0.2;
> > +}
> 
> Otherwise LGTM.
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to