Hi Kyrill, thanks for the very quick response!
On 02/12/2024 15:09, Kyrylo Tkachov wrote:
> Thanks for the patch. As this is sent after the end of stage1 and is not
> finishing support for an architecture feature perhaps we should stage this for
> GCC 16.
> But if it fixes a performance problem in a real app or, better yet, fixes a
> performance regression then we should consider it for this cycle.
Sorry, I should have specified in the cover letter that this was
originally intended for GCC 16... although it would improve performance
in some video codecs, as this is where the issue was first raised. I'll
try and find out a bit more about this if needed.
> … The UZP1 instruction doesn’t accept .2h operands so I don’t think this
> pattern is valid for the V2SF value of VDQHSD_F
> We should have tests for the various sizes that the new pattern covers.
Okay, I'll correct the modes and then write tests for the ones that remain.
Many thanks,
Akram