On Fri, Sep 6, 2019 at 2:13 PM Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
>
> Hi,
>
> +(simplify
> +  (convert
> +    (rshift
> +      (mult
>
> > is the outer convert really necessary?  That is, if we change
> > the simplification result to
>
> Indeed that should be "convert?" to make it optional.

Rather drop it, a generated conversion should be elided by conversion
simplification.

> > Is the Hamming weight popcount
> > faster than the libgcc table-based approach?  I wonder if we really
> > need to restrict this conversion to the case where the target
> > has an expander.
>
> Well libgcc uses the exact same sequence (not a table):
>
> objdump -d ./aarch64-unknown-linux-gnu/libgcc/_popcountsi2.o
>
> 0000000000000000 <__popcountdi2>:
>    0:   d341fc01    lsr   x1, x0, #1
>    4:   b200c3e3    mov   x3, #0x101010101010101    // #72340172838076673
>    8:   9200f021    and   x1, x1, #0x5555555555555555
>    c:   cb010001    sub   x1, x0, x1
>   10:   9200e422    and   x2, x1, #0x3333333333333333
>   14:   d342fc21    lsr   x1, x1, #2
>   18:   9200e421    and   x1, x1, #0x3333333333333333
>   1c:   8b010041    add   x1, x2, x1
>   20:   8b411021    add   x1, x1, x1, lsr #4
>   24:   9200cc20    and   x0, x1, #0xf0f0f0f0f0f0f0f
>   28:   9b037c00    mul   x0, x0, x3
>   2c:   d378fc00    lsr   x0, x0, #56
>   30:   d65f03c0    ret
>
> So if you don't check for an expander you get an endless loop in libgcc since
> the makefile doesn't appear to use -fno-builtin anywhere...

Hm, must be aarch specific.  But indeed it should use -fno-builtin ...

Richard.

> Wilco
>
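
For reference, the instruction sequence in the disassembly above reads
roughly as the following C.  This is only a sketch transcribed from the
objdump listing, not the libgcc source; the function name
popcount64_swar is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: a C rendering of the instruction sequence shown
   in the quoted __popcountdi2 disassembly (bit-parallel Hamming weight). */
unsigned popcount64_swar(uint64_t x)
{
    /* lsr/and/sub: each 2-bit field now holds the count of its two bits. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* and/lsr/and/add: sum adjacent 2-bit counts into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* add with lsr #4, then and: sum adjacent 4-bit counts into bytes. */
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* mul/lsr #56: the multiply accumulates all byte counts into the
       top byte, which the shift then extracts. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}
```

This is the shape the new pattern would match.  If a function like this
is rewritten into a popcount call on a target without an expander, and
that call resolves to __popcountdi2, which is itself written this way,
you get the endless loop Wilco describes.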