Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-08-04 Thread Jeff Law via Gcc-patches
On 8/1/23 00:47, Robin Dapp via Gcc-patches wrote:
I'm not against continuing with the more well-known approach for now but we should keep in mind that might still be potential for improvement.
No. I don't think it's faster.
I did a quick check on my x86 laptop and it's roughly 25% fas

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
>>> I'm not against continuing with the more well-known approach for now
>>> but we should keep in mind that might still be potential for improvement.
> No. I don't think it's faster.
I did a quick check on my x86 laptop and it's roughly 25% faster there. That's consistent with the literature.

Re: Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread 钟居哲
's meaningless. Thanks.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-08-01 03:38
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng
Subject: Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization
Hi Juzhe,
> +/* Expand Vector POPCOUNT by parallel popcnt:

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
> +/* FIXME: We don't allow vectorize "__builtin_popcountll" yet since it needs "vec_pack_trunc" support
> +   and such pattern may cause inferior codegen.
> +   We will enable "vec_pack_trunc" when we support reasonable vector cost model.  */
Wait, why do we need vec_pack_trunc f
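For context on the FIXME quoted above: `__builtin_popcountll` takes an `unsigned long long` and returns `int`, so a loop such as the hypothetical `popcount_array` below reads 64-bit elements and stores 32-bit results. Whether that narrowing is what the comment means by needing "vec_pack_trunc" is an assumption here; the snippet is cut off before the question is answered. The function name and signature are made up for illustration and do not come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical loop (not from the patch): 64-bit inputs, int outputs.
// __builtin_popcountll is the GCC/Clang builtin; it returns int, so each
// result is narrower than the element it was computed from.
void popcount_array(const uint64_t *in, int *out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = __builtin_popcountll(in[i]);
}
```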

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe,
> +/* Expand Vector POPCOUNT by parallel popcnt:
> +
> +   int parallel_popcnt(uint32_t n) {
> +   #define POW2(c)      (1U << (c))
> +   #define MASK(c)      (static_cast<uint32_t>(-1) / (POW2(POW2(c)) + 1U))
> +   #define COUNT(x, c)  ((x) & MASK(c)) + (((x)>>(POW2(c))) & MASK(c))
> +   n = CO
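The quoted comment is cut off after the macro definitions. As a rough scalar sketch of the same parallel-popcount idea, the block below turns `POW2`, `MASK` and `COUNT` into functions and applies the counting step for c = 0..4. The loop form, the helper names `mask`, `count_step` and `naive_popcnt`, and the `self_check` function are assumptions added for illustration; the patch text after `n = CO` is not visible here and may be arranged differently.

```cpp
#include <cassert>
#include <cstdint>

// Mask of alternating 2^c-bit groups:
// c=0 -> 0x55555555, c=1 -> 0x33333333, c=2 -> 0x0F0F0F0F,
// c=3 -> 0x00FF00FF, c=4 -> 0x0000FFFF.
uint32_t mask(unsigned c) {
    return static_cast<uint32_t>(-1) / ((1U << (1U << c)) + 1U);
}

// Add each 2^c-bit field to its upper neighbour, giving 2^(c+1)-bit sums.
uint32_t count_step(uint32_t x, unsigned c) {
    return (x & mask(c)) + ((x >> (1U << c)) & mask(c));
}

// Five steps fold 32 one-bit counts into a single 32-bit total.
int parallel_popcnt(uint32_t n) {
    for (unsigned c = 0; c < 5; ++c)
        n = count_step(n, c);
    return static_cast<int>(n);
}

// Bit-by-bit reference, used only to cross-check the sketch.
int naive_popcnt(uint32_t n) {
    int count = 0;
    for (; n != 0; n >>= 1)
        count += static_cast<int>(n & 1U);
    return count;
}

// Small cross-check against the reference on a few sample values.
void self_check() {
    const uint32_t samples[] = {0u, 1u, 0x80000001u, 0x12345678u, 0xFFFFFFFFu};
    for (uint32_t s : samples)
        assert(parallel_popcnt(s) == naive_popcnt(s));
}
```

Each step uses only shifts, ANDs and adds on whole words, which is why the same sequence can be applied lane-wise to a vector register.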