On Mon, Jul 24, 2017 at 8:28 PM, Balbir Singh <bsinghar...@gmail.com> wrote:
> On Mon, Jul 24, 2017 at 11:01 AM, Matt Brown
> <matthew.brown....@gmail.com> wrote:
>> This adds emulations for the popcntb, popcntw, and popcntd instructions.
>> Tested for correctness against the popcnt{b,w,d} instructions on ppc64le.
>>
>> Signed-off-by: Matt Brown <matthew.brown....@gmail.com>
>> ---
>> v2:
>>   - fixed opcodes
>>   - fixed typecasting
>>   - fixed bitshifting error for both 32 and 64bit arch
>> ---
>>  arch/powerpc/lib/sstep.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index 87d277f..e6a16a3 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -612,6 +612,35 @@ static nokprobe_inline void do_cmpb(struct pt_regs
>> *regs, unsigned long v1,
>>         regs->gpr[rd] = out_val;
>>  }
>>
>> +/*
>> + * The size parameter is used to adjust the equivalent popcnt instruction.
>> + * popcntb = 8, popcntw = 32, popcntd = 64
>> + */
>> +static nokprobe_inline void do_popcnt(struct pt_regs *regs, unsigned long
>> v1,
>> +                               int size, int ra)
>> +{
>> +       unsigned long long high, low, mask;
>> +       unsigned int n;
>> +       int i, j;
>> +
>> +       high = 0;
>> +       low = 0;
>> +
>> +       for (i = 0; i < (64 / size); i++) {
>> +               n = 0;
>> +               for (j = 0; j < size; j++) {
>> +                       mask = 1UL << (j + (i * size));
>> +                       if (v1 & mask)
>> +                               n++;
>> +               }
>> +               if ((i * size) < 32)
>> +                       low |= n << (i * size);
>> +               else
>> +                       high |= n << ((i * size) - 32);
>> +       }
>> +       regs->gpr[ra] = (high << 32) | low;
>> +}
>
> There's a way to do it in very efficient way via the Giles-Miller
> method of side-ways addition
>
> Please see
>
> http://opensourceforu.com/2012/06/power-programming-bitwise-tips-tricks/
> and lib/hweight.c, you can reuse the code from lib/hweight.c

Oh that's a really cool technique. We could use that for the parity
instructions too.

>
> Balbir Singh
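
For reference, here is a rough user-space sketch of how the sideways-addition
approach could replace the nested loops. It is not the code from
lib/hweight.c and not a proposed v3; the function name popcnt_swar and the
standalone uint64_t interface are made up for illustration. The size argument
follows the patch's convention (8 = popcntb, 32 = popcntw, 64 = popcntd).

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Counts set bits in v with sideways (SWAR) addition and returns the
 * result laid out per byte (size 8), per 32-bit word (size 32), or as
 * a single total (size 64).
 */
static uint64_t popcnt_swar(uint64_t v, int size)
{
	uint64_t out = v;

	/* Each 2-bit field now holds the count of its two bits. */
	out -= (out >> 1) & 0x5555555555555555ULL;
	/* Each 4-bit field now holds the count of its four bits. */
	out = (out & 0x3333333333333333ULL) +
	      ((out >> 2) & 0x3333333333333333ULL);
	/* Each byte now holds the count of its eight bits (0..8). */
	out = (out + (out >> 4)) & 0x0f0f0f0f0f0f0f0fULL;

	if (size == 8)		/* popcntb: one count per byte */
		return out;

	/* Fold the byte counts together; no field can overflow a byte. */
	out += out >> 8;
	out += out >> 16;

	if (size == 32)		/* popcntw: one count per 32-bit word */
		return out & 0x0000003f0000003fULL;

	/* popcntd: add the two word counts, total is at most 64. */
	return (out + (out >> 32)) & 0x7f;
}
```

The first three steps are shared by all three instructions, so the size
check only decides how far the partial sums get folded together. That also
removes the separate high/low handling the loop version needed.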