Hi Kyrill, thanks for the very quick response!

On 02/12/2024 15:09, Kyrylo Tkachov wrote:
> Thanks for the patch. As this is sent after the end of stage1 and is not
> finishing support for an architecture feature perhaps we should stage this for
> GCC 16.
> But if it fixes a performance problem in a real app or, better yet, fixes a
> performance regression then we should consider it for this cycle.

Sorry, I should have specified in the cover letter that this was originally
intended for GCC 16, although it would improve performance in some video
codecs, which is where the issue was first raised. I'll try to find out a bit
more about this if needed.

> … The UZP1 instruction doesn’t accept .2h operands so I don’t think this
> pattern is valid for the V2SF value of VDQHSD_F
> We should have tests for the various sizes that the new pattern covers.

Okay, I'll correct the modes and then write tests for the ones that remain.

Many thanks,
Akram
