On Fri, 30 May 2025, Dmitriy Kovalenko wrote:
If by "non-performant mobile" you mean small in-order cores, most of them can handle repeated
accumulations like these even faster if you sequence them so that all accumulations to one register
are done sequentially. E.g. first all "smlal \u_dst1\().4
I'm sorry about the previous patch. It seems something went wrong at the Outlook step and a
corrupted patch got sent. I'll keep using send-email.
=== __every single__ inline comment response ===
> This is an unrelated change
Fixed and resolved
> The patch adds trailing whitespace here