v2di scalar multiply for NEON32

husseydevin at gmail dot com Mon, 31 Dec 2018 17:16:49 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88510


Devin Hussey <husseydevin at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|GCC generates inefficient   |GCC generates inefficient
                   |U64x2 scalar multiply for   |U64x2/v2di scalar multiply
                   |NEON32                      |for NEON32

--- Comment #1 from Devin Hussey <husseydevin at gmail dot com> ---
I noticed that the scalarization happens in the veclower21 pass.

While writing a patch for LLVM, I found that the x86 lowering could be carried
over almost verbatim: add truncates and replace the SSE instructions with their
NEON equivalents. I am willing to write the GCC patch if someone can tell me
where the SSE code lives and where the NEON code should go. That kind of
pointer is what helped me with the LLVM patch.
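
For reference, here is a rough sketch of the decomposition I have in mind,
written as portable C rather than actual NEON or SSE code. It assumes the usual
approach of building a 64-bit product from 32x32->64 widening multiplies (the
kind of operation vmull.u32 or pmuludq performs per lane). The type and
function names are hypothetical and only model the arithmetic; this is not the
code from the LLVM patch or from GCC.

```c
#include <stdint.h>

/* Hypothetical stand-in for a two-lane 64-bit vector (v2di / U64x2). */
typedef struct { uint64_t lane[2]; } u64x2_model;

/* 32x32 -> 64 widening multiply: models what a widening vector
   multiply would compute for a single lane. */
static uint64_t widening_mul_u32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* One 64-bit lane multiply (mod 2^64) built only from 32-bit halves. */
static uint64_t mul64_from_32bit_parts(uint64_t a, uint64_t b)
{
    /* The "truncates": split each operand into low and high halves. */
    uint32_t a_lo = (uint32_t)a;
    uint32_t a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b;
    uint32_t b_hi = (uint32_t)(b >> 32);

    /* Cross terms. Only their low 32 bits survive the shift below,
       so wraparound in this sum is harmless. The a_hi * b_hi term
       would be shifted out of the 64-bit result entirely. */
    uint64_t cross = widening_mul_u32(a_hi, b_lo)
                   + widening_mul_u32(a_lo, b_hi);

    return widening_mul_u32(a_lo, b_lo) + (cross << 32);
}

/* Apply the lane multiply to both lanes of the model vector. */
static u64x2_model mul_u64x2_model(u64x2_model a, u64x2_model b)
{
    u64x2_model r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = mul64_from_32bit_parts(a.lane[i], b.lane[i]);
    return r;
}
```

The point is that nothing here needs a native 64-bit vector multiply: three
widening multiplies, one add, one shift and one more add per lane are enough,
so the lanes never have to be moved out of the vector registers.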
