> On 29 Nov 2024, at 14:16, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>> On 27 Nov 2024, at 09:34, Richard Sandiford <richard.sandif...@arm.com>
>>> wrote:
>>>
>>> Soumya AR <soum...@nvidia.com> writes:
>>>> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain
>>>> operands inverted. These can be extended to work with Neon modes.
>>>>
>>>> Since these instructions are unpredicated, duplicate patterns were added
>>>> with the predicate removed to generate these instructions for Neon modes.
>>>>
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no
>>>> regression.
>>>> OK for mainline?
>>>>
>>>> Signed-off-by: Soumya AR <soum...@nvidia.com>
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> * config/aarch64/aarch64-sve2.md
>>>> (*aarch64_sve2_nbsl_unpred<mode>): New pattern to match unpredicated
>>>> form.
>>>> (*aarch64_sve2_bsl1n_unpred<mode>): Likewise.
>>>> (*aarch64_sve2_bsl2n_unpred<mode>): Likewise.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> * gcc.target/aarch64/sve/bitsel.c: New test.
>>>
>>> Thanks for the patch. But since this is a new optimisation, and is not
>>> fixing a regression, I'm not sure whether it would be appropriate during
>>> stage 3. Let's see what other maintainers say.
>>
>> IMO it’s not high risk but it’s a nice-to-have optimisation rather than
>> driven by a concrete motivating workload.
>> Given that we have a few such patches (like the ASRD patch from Soumya) it
>> would be consistent to either take them all now or stage them all for GCC 16.
>
> Yeah, agreed. I'd chosen this patch somewhat arbitrarily, but it was
> really a comment about the ongoing work in general.
>
>> I’d be okay with deferring them to GCC 16 but would appreciate if they
>> received some feedback on the implementation beforehand so they can be
>> polished for next stage1.
>
> Sure, will try to get to them soon.
>
> I'm also not strongly opposed to the patches going in. Full disclosure:
> there are some bits of FP8 work that (despite our best efforts) slipped
> into stage 3 due to unforeseen circumstances, and still need to be posted.
> I'm hoping they can still go in, since the alternative would be to
> disable all the existing FP8 work for GCC 15.
>
> Given that, it probably seems hypocritical to push back on these
> SVE-for-NEON patches. The reason I did is that the work seems like an
> ongoing project with no well-defined end point, so it seemed like the
> GCC 15 cut-off would have to be time-driven rather than feature-driven.

Yeah no problem. I’d like to see FP8 land properly in GCC 15 too.
Thanks,
Kyrill

>
> Thanks for all the work on this though -- it's definitely a useful project.
>
> Richard