Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> On 27 Nov 2024, at 09:34, Richard Sandiford <richard.sandif...@arm.com> wrote:
>> 
>> Soumya AR <soum...@nvidia.com> writes:
>>> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain
>>> operands inverted. These can be extended to work with Neon modes.
>>> 
>>> Since these instructions are unpredicated, duplicate patterns were added
>>> with the predicate removed to generate these instructions for Neon modes.
>>> 
>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>> regression.
>>> OK for mainline?
>>> 
>>> Signed-off-by: Soumya AR <soum...@nvidia.com>
>>> 
>>> gcc/ChangeLog:
>>> 
>>> * config/aarch64/aarch64-sve2.md
>>> (*aarch64_sve2_nbsl_unpred<mode>): New pattern to match unpredicated
>>> form.
>>> (*aarch64_sve2_bsl1n_unpred<mode>): Likewise.
>>> (*aarch64_sve2_bsl2n_unpred<mode>): Likewise.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> * gcc.target/aarch64/sve/bitsel.c: New test.
>> 
>> Thanks for the patch.  But since this is a new optimisation, and is not
>> fixing a regression, I'm not sure whether it would be appropriate during
>> stage 3.  Let's see what other maintainers say.
>
> IMO it’s not high risk, but it’s a nice-to-have optimisation rather than
> one driven by a concrete motivating workload.
> Given that we have a few such patches (like the ASRD patch from Soumya), it
> would be consistent to either take them all now or stage them all for GCC 16.

Yeah, agreed.  I'd chosen this patch somewhat arbitrarily, but it was
really a comment about the ongoing work in general.

> I’d be okay with deferring them to GCC 16 but would appreciate it if they
> received some feedback on the implementation beforehand so they can be
> polished for next stage1.

Sure, will try to get to them soon.

I'm also not strongly opposed to the patches going in.  Full disclosure:
there are some bits of FP8 work that (despite our best efforts) slipped
into stage 3 due to unforeseen circumstances, and still need to be posted.
I'm hoping they can still go in, since the alternative would be to
disable all the existing FP8 work for GCC 15.

Given that, it probably seems hypocritical to push back on these
SVE-for-NEON patches.  The reason I did is that the work seems like an
ongoing project with no well-defined end point, so it seemed like the
GCC 15 cut-off would have to be time-driven rather than feature-driven.

Thanks for all the work on this though -- it's definitely a useful project.

Richard
