RE: [PATCH] aarch64: Improve Advanced SIMD popcount expansion by using SVE [PR113860]

2024-07-31 Thread Pengxuan Zheng (QUIC)
> Sorry for the slow review.
>
> Pengxuan Zheng writes:
> > This patch improves the Advanced SIMD popcount expansion by using SVE
> > if available.
> >
> > For example, GCC currently generates the following code sequence for V2DI:
> >   cnt     v31.16b, v31.16b
> >   uaddlp  v31.8h, v31.16b
> >

Re: [PATCH] aarch64: Improve Advanced SIMD popcount expansion by using SVE [PR113860]

2024-07-29 Thread Richard Sandiford
Sorry for the slow review.

Pengxuan Zheng writes:
> This patch improves the Advanced SIMD popcount expansion by using SVE if
> available.
>
> For example, GCC currently generates the following code sequence for V2DI:
>   cnt     v31.16b, v31.16b
>   uaddlp  v31.8h, v31.16b
>   uaddlp  v31.4s, v31

[PATCH] aarch64: Improve Advanced SIMD popcount expansion by using SVE [PR113860]

2024-07-17 Thread Pengxuan Zheng
This patch improves the Advanced SIMD popcount expansion by using SVE if
available.

For example, GCC currently generates the following code sequence for V2DI:
  cnt     v31.16b, v31.16b
  uaddlp  v31.8h, v31.16b
  uaddlp  v31.4s, v31.8h
  uaddlp  v31.2d, v31.4s

However, by using SVE, we can gener
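
To make the cost of the current V2DI expansion concrete, here is a minimal portable C sketch of what the four-instruction Advanced SIMD sequence quoted in the patch description computes. It is a scalar model only, not code from the patch or from GCC: the function names (cnt_16b, uaddlp_8h, uaddlp_4s, uaddlp_2d, popcount_v2di, popcount_scalar, self_check) are hypothetical, and the lane behaviour assumes the usual meaning of CNT (per-byte bit count) and UADDLP (add adjacent unsigned pairs into elements twice as wide).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of "cnt vD.16b, vN.16b": population count of each byte. */
static void cnt_16b(const uint8_t in[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        uint8_t v = in[i], c = 0;
        while (v) {
            c += v & 1u;
            v >>= 1;
        }
        out[i] = c;
    }
}

/* Models of "uaddlp": sum adjacent unsigned pairs into wider elements. */
static void uaddlp_8h(const uint8_t in[16], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint16_t)(in[2 * i] + in[2 * i + 1]);
}

static void uaddlp_4s(const uint16_t in[8], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint32_t)in[2 * i] + in[2 * i + 1];
}

static void uaddlp_2d(const uint32_t in[4], uint64_t out[2])
{
    for (int i = 0; i < 2; i++)
        out[i] = (uint64_t)in[2 * i] + in[2 * i + 1];
}

/* V2DI popcount as modelled: one cnt followed by three widening steps. */
void popcount_v2di(const uint64_t in[2], uint64_t out[2])
{
    uint8_t bytes[16], counts[16];
    uint16_t h[8];
    uint32_t s[4];

    /* Bytes 0..7 belong to in[0] and 8..15 to in[1] on any endianness. */
    memcpy(bytes, in, sizeof bytes);
    cnt_16b(bytes, counts);   /* cnt    v.16b          */
    uaddlp_8h(counts, h);     /* uaddlp v.8h, v.16b    */
    uaddlp_4s(h, s);          /* uaddlp v.4s, v.8h     */
    uaddlp_2d(s, out);        /* uaddlp v.2d, v.4s     */
}

/* Scalar reference: clear the lowest set bit until none remain. */
uint64_t popcount_scalar(uint64_t v)
{
    uint64_t c = 0;
    while (v) {
        v &= v - 1;
        c++;
    }
    return c;
}

/* Check the lane model against the scalar reference for one input pair. */
void self_check(uint64_t a, uint64_t b)
{
    uint64_t in[2] = { a, b }, out[2];
    popcount_v2di(in, out);
    assert(out[0] == popcount_scalar(a));
    assert(out[1] == popcount_scalar(b));
}
```

The model shows why the 64-bit case needs three widening steps after the byte-wise count: CNT only operates on bytes, so each UADDLP doubles the element width until the per-byte counts have been summed into the two 64-bit lanes.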