> On 29 Nov 2024, at 14:16, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>> On 27 Nov 2024, at 09:34, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>> 
>>> Soumya AR <soum...@nvidia.com> writes:
>>>> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain
>>>> operands inverted. These can be extended to work with Neon modes.
>>>> 
>>>> Since these instructions are unpredicated, duplicate patterns were added
>>>> with the predicate removed to generate these instructions for Neon modes.
>>>> 
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>>> regression.
>>>> OK for mainline?
>>>> 
>>>> Signed-off-by: Soumya AR <soum...@nvidia.com>
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>> * config/aarch64/aarch64-sve2.md
>>>> (*aarch64_sve2_nbsl_unpred<mode>): New pattern to match unpredicated
>>>> form.
>>>> (*aarch64_sve2_bsl1n_unpred<mode>): Likewise.
>>>> (*aarch64_sve2_bsl2n_unpred<mode>): Likewise.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>> * gcc.target/aarch64/sve/bitsel.c: New test.
>>> 
>>> Thanks for the patch.  But since this is a new optimisation, and is not
>>> fixing a regression, I'm not sure whether it would be appropriate during
>>> stage 3.  Let's see what other maintainers say.
>> 
>> IMO it’s not high risk but it’s a nice-to-have optimisation rather than 
>> driven by a concrete motivating workload.
>> Given that we have a few such patches (like the ASRD patch from Soumya) it 
>> would be consistent to either take them all now or stage them all for GCC 16.
> 
> Yeah, agreed.  I'd chosen this patch somewhat arbitrarily, but it was
> really a comment about the ongoing work in general.
> 
>> I’d be okay with deferring them to GCC 16 but would appreciate if they 
>> received some feedback on the implementation beforehand so they can be 
>> polished for next stage1.
> 
> Sure, will try to get to them soon.
> 
> I'm also not strongly opposed to the patches going in.  Full disclosure:
> there are some bits of FP8 work that (despite our best efforts) slipped
> into stage 3 due to unforeseen circumstances, and still need to be posted.
> I'm hoping they can still go in, since the alternative would be to
> disable all the existing FP8 work for GCC 15.
> 
> Given that, it probably seems hypocritical to push back on these SVE-for-
> NEON patches.  The reason I did is that the work seems like an ongoing
> project with no well-defined end point, so it seemed like the GCC 15
> cut-off would have to be time-driven rather than feature-driven.

Yeah no problem. I’d like to see FP8 land properly in GCC 15 too.
Thanks,
Kyrill

> 
> Thanks for all the work on this though -- it's definitely a useful project.
> 
> Richard
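
For reference, the bit-select operations named in the patch description at the
top of the thread can be sketched in scalar C. This is only an illustration of
the per-bit semantics as described in the Arm SVE2 instruction documentation;
it is not code from the patch or from gcc.target/aarch64/sve/bitsel.c. The
function names, the use of uint64_t, and the self-check helper are assumptions
made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar models of the SVE2 bit-select family.
   k is the selector: a set bit takes the bit from a, a clear bit from b. */

/* BSL: plain bitwise select. */
static inline uint64_t bsl(uint64_t a, uint64_t b, uint64_t k)
{
    return (a & k) | (b & ~k);
}

/* NBSL: bitwise select with the whole result inverted. */
static inline uint64_t nbsl(uint64_t a, uint64_t b, uint64_t k)
{
    return ~((a & k) | (b & ~k));
}

/* BSL1N: bitwise select with the first input inverted. */
static inline uint64_t bsl1n(uint64_t a, uint64_t b, uint64_t k)
{
    return (~a & k) | (b & ~k);
}

/* BSL2N: bitwise select with the second input inverted. */
static inline uint64_t bsl2n(uint64_t a, uint64_t b, uint64_t k)
{
    return (a & k) | (~b & ~k);
}

/* Illustrative helper: checks each variant against plain BSL. */
static inline void bitsel_self_check(void)
{
    const uint64_t a = UINT64_C(0x0123456789abcdef);
    const uint64_t b = UINT64_C(0xfedcba9876543210);
    const uint64_t k = UINT64_C(0x00ff00ff00ff00ff);

    assert(nbsl(a, b, k) == ~bsl(a, b, k));
    assert(bsl1n(a, b, k) == bsl(~a, b, k));
    assert(bsl2n(a, b, k) == bsl(a, ~b, k));
}
```

A loop that applies one of these expressions element-wise is the kind of source
the new unpredicated patterns are intended to match for Neon modes when SVE2 is
available. Whether a given loop actually selects NBSL, BSL1N, or BSL2N depends
on the compiler version and flags.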