> Am 04.12.2024 um 16:26 schrieb Tamar Christina <tamar.christ...@arm.com>: > > >> >> -----Original Message----- >> From: Richard Biener <rguent...@suse.de> >> Sent: Wednesday, December 4, 2024 2:43 PM >> To: Tamar Christina <tamar.christ...@arm.com> >> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com> >> Subject: Re: [PATCH 4/7]middle-end: Fix mask length arg in call to >> vect_get_loop_mask [PR96342] >> >>> On Wed, 4 Dec 2024, Tamar Christina wrote: >>> >>> Hi All, >>> >>> When issuing multiple calls to a simdclone in a vectorized loop, >>> TYPE_VECTOR_SUBPARTS(vectype) gives the incorrect number when compared >>> to the TYPE_VECTOR_SUBPARTS result we get from the mask type derived >>> from the relevant `rgroup_controls' entry within `vect_get_loop_mask'. >>> >>> By passing `masktype' instead, we are able to get the correct number of >>> vector subparts and thu eliminate the ICE in the call to >>> `vect_get_loop_mask' when the data type for which we retrieve the mask >>> is wider than the one used when defining the mask at mask registration >>> time. >>> >>> gcc/ChangeLog: >>> >>> PR target/96342 >>> * tree-vect-stmts.cc (vectorizable_simd_clone_call): >>> s/vectype/masktype/ in call to vect_get_loop_mask. >>> >>> >>> Bootstrapped Regtested on aarch64-none-linux-gnu, >>> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu >>> -m32, -m64 and no issues. >>> >>> Ok for master? >> >> Hmm, I think the corresponding vect_record_loop_mask is >> >> case SIMD_CLONE_ARG_TYPE_MASK: >> if (loop_vinfo >> && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) >> { >> unsigned nmasks >> = exact_div (ncopies * bestn->simdclone->simdlen, >> TYPE_VECTOR_SUBPARTS >> (vectype)).to_constant (); >> vect_record_loop_mask (loop_vinfo, >> &LOOP_VINFO_MASKS (loop_vinfo), >> nmasks, vectype, op); >> } >> >> and that doesn't exactly match the bestn->simdclone->nargs - 1 >> mask use then unless bestn->simdclone->args[mask_i].vector_type >> == vectype? >> >> The patch looks OK, but I think we miss a corresponding >> vect_record_loop_mask case. I also wonder how we support a SIMD clone >> call where the result vector type subparts doesn't match the argument >> or loop mask number of subparts? >> > > I think it's correct because of the difference in type is accounted for In > the usage: > > As you posted, when we register a mask we use nmasks, because it's > ncopies * bestn->simdclone->simdlen. So nmasks represents the widest input. > > When using it we do: > > mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, > ncopies, vectype, j); > > My understanding is that the reason for this is that when registering the mask > we want to register the smaller type, and when using it, since you may have > decomposed in multiple vectors, vect_get_look_mask on the smaller type will > give you the unpacked mask. > > The problem was before that we passed the wrong type here. So we gave the > input > type rather than the expected decomposed mask. Yes, the Type Change definitely is OK. I just Wondered About The registering… Richard > > Thanks, > Tamar > >>> Thanks, >>> Tamar >>> >>> --- >>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc >>> index >> 497a31322accba8672b82dee00f5403b40dca22b..be1139a423c85608755f107 >> 50bb68e70223a84ea 100644 >>> --- a/gcc/tree-vect-stmts.cc >>> +++ b/gcc/tree-vect-stmts.cc >>> @@ -4964,7 +4964,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, >> stmt_vec_info stmt_info, >>> { >>> vec_loop_masks *loop_masks = &LOOP_VINFO_MASKS >> (loop_vinfo); >>> mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, >>> - ncopies, vectype, j); >>> + ncopies, masktype, j); >>> } >>> else >>> mask = vect_build_all_ones_mask (vinfo, stmt_info, masktype); >>> >>> >>> >>> >>> >> >> -- >> Richard Biener <rguent...@suse.de> >> SUSE Software Solutions Germany GmbH, >> Frankenstrasse 146, 90461 Nuernberg, Germany; >> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)