On Tue, Mar 3, 2020 at 4:46 AM Jesse Zhang <sbje...@gmail.com> wrote: > The math in the new (from v4 to v6) patch is wrong: it yields > ceil_log2(1) = 1 or next_power_of_2(1) = 2.
I think you're right. > I can see that you lifted > the restriction of "num greater than one" for ceil_log2() in this patch > set, but it's now _more_ problematic to base those functions on > pg_leftmost_one_pos(). > I'm not comfortable with your changes to pg_leftmost_one_pos() to remove > the restriction on word being non-zero. Specifically > pg_leftmost_one_pos() is made to return 0 on 0 input. While none of its > current callers (in HEAD) is harmed, this introduces muddy semantics: > > 1. pg_leftmost_one_pos is semantically undefined on 0 input: scanning > for a set bit in a zero word won't find it anywhere. Right. > I've quickly put together a PoC patch on top of yours, which > re-implements ceil_log2 using LZCNT coupled with a CPUID check. > Thoughts? This patch seems to be making an assumption that an indirect function call is faster than taking a branch (in inlined code) that the CPU will almost always predict correctly. It would be nice to have some numbers to compare. (against pg_count_leading_zeros_* using the "slow" versions but statically inlined). Stylistically, "8 * sizeof(num)" is a bit overly formal, since the hard-coded number we want is in the name of the function. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services