> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Wednesday, December 4, 2024 2:43 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> Subject: Re: [PATCH 4/7]middle-end: Fix mask length arg in call to
> vect_get_loop_mask [PR96342]
> 
> On Wed, 4 Dec 2024, Tamar Christina wrote:
> 
> > Hi All,
> >
> > When issuing multiple calls to a simdclone in a vectorized loop,
> > TYPE_VECTOR_SUBPARTS(vectype) gives the incorrect number when compared
> > to the TYPE_VECTOR_SUBPARTS result we get from the mask type derived
> > from the relevant `rgroup_controls' entry within `vect_get_loop_mask'.
> >
> > By passing `masktype' instead, we are able to get the correct number of
> > vector subparts and thu eliminate the ICE in the call to
> > `vect_get_loop_mask' when the data type for which we retrieve the mask
> > is wider than the one used when defining the mask at mask registration
> > time.
> >
> > gcc/ChangeLog:
> >
> >     PR target/96342
> >     * tree-vect-stmts.cc (vectorizable_simd_clone_call):
> >     s/vectype/masktype/ in call to vect_get_loop_mask.
> >
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Ok for master?
> 
> Hmm, I think the corresponding vect_record_loop_mask is
> 
>             case SIMD_CLONE_ARG_TYPE_MASK:
>               if (loop_vinfo
>                   && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
>                 {
>                   unsigned nmasks
>                     = exact_div (ncopies * bestn->simdclone->simdlen,
>                                  TYPE_VECTOR_SUBPARTS
> (vectype)).to_constant ();
>                   vect_record_loop_mask (loop_vinfo,
>                                          &LOOP_VINFO_MASKS (loop_vinfo),
>                                          nmasks, vectype, op);
>                 }
> 
> and that doesn't exactly match the bestn->simdclone->nargs - 1
> mask use then unless bestn->simdclone->args[mask_i].vector_type
> == vectype?
> 
> The patch looks OK, but I think we miss a corresponding
> vect_record_loop_mask case.  I also wonder how we support a SIMD clone
> call where the result vector type subparts doesn't match the argument
> or loop mask number of subparts?
> 

I think it's correct because of the difference in type is accounted for In the 
usage:

As you posted, when we register a mask we use nmasks, because it's
ncopies * bestn->simdclone->simdlen. So nmasks represents the widest input.

When using it we do:

          mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks,
                         ncopies, vectype, j);

My understanding is that the reason for this is that when registering the mask
we want to register the smaller type, and when using it, since you may have
decomposed in multiple vectors, vect_get_look_mask on the smaller type will
give you the unpacked mask.

The problem was before that we passed the wrong type here.  So we gave the input
type rather than the expected decomposed mask.

Thanks,
Tamar

> > Thanks,
> > Tamar
> >
> > ---
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index
> 497a31322accba8672b82dee00f5403b40dca22b..be1139a423c85608755f107
> 50bb68e70223a84ea 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -4964,7 +4964,7 @@ vectorizable_simd_clone_call (vec_info *vinfo,
> stmt_vec_info stmt_info,
> >             {
> >               vec_loop_masks *loop_masks = &LOOP_VINFO_MASKS
> (loop_vinfo);
> >               mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks,
> > -                                        ncopies, vectype, j);
> > +                                        ncopies, masktype, j);
> >             }
> >           else
> >             mask = vect_build_all_ones_mask (vinfo, stmt_info, masktype);
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to