On Fri, Dec 27, 2019 at 11:05 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > John Naylor <john.nay...@2ndquadrant.com> writes: > > On Fri, Dec 27, 2019 at 9:54 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > >> ... but couldn't the > >> right shift be elided in favor of changing the constant we > >> subtract clz's result from? Shifting off those bits separately > >> made sense in the old implementation, but assuming that CLZ is > >> more or less constant-time, it doesn't with CLZ. > > > That makes sense -- I'll look into doing that. > > Actually, we could apply that insight to both code paths. > In the existing path, that requires assuming > ALLOCSET_NUM_FREELISTS+ALLOC_MINBITS <= 17, but that's OK. > (Nowadays I'd probably add a StaticAssert about that.)
I tried that in the attached files and got these results: current 6.14s clz 4.52s clz no right shift 3.15s lookup table 5.56s lookup table no right shift 7.34s Here, "lookup table" refers to using the pg_leftmost_one_pos[] array and incrementing the result. Removing the shift operation from the CLZ case is clearly an improvement, and the main body goes from movabsq $34359738367, %rax addq %rax, %rdi shrq $3, %rdi bsrl %edi, %eax xorl $-32, %eax addl $33, %eax to decl %edi bsrl %edi, %eax xorl $-32, %eax addl $30, %eax The lookup table case is less clear. Removing the shift results in assembly that looks more like the C code and is slower for me. The standard lookup table code uses some magic constants and does its own constant folding by shifting 11 (8 + 3). In the absence of testing on platforms that will actually exercise this path, it seems the open-coded path should keep the shift for now. Thoughts? -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
v2-test_allocsetfreeindex.c
Description: Binary data
v2-test_allocsetfreeindex2.c
Description: Binary data