On Thu, 2020-07-30 at 19:16 +0200, Tomas Vondra wrote:
> > Essentially:
> >    initHyperLogLog(&hll, 5)
> >    for i in 0 .. one billion
> >        addHyperLogLog(&hll, hash(i))
> >    estimateHyperLogLog
> >
> > The numbers are the same regardless of bwidth.
> >
> > Before my patch, it takes about 15.6s. After my patch, it takes about
> > 6.6s, so it's more than a 2X speedup (including the hash calculation).
>
> Wow. That's a huge improvement.
To be clear: the 2X+ speedup was on the tight loop test.

> What does the whole test (data + query) look like? Is it a particularly
> rare / special case, or something reasonable to expect in practice?

The whole-query test was:

config:
  shared_buffers=8GB
  jit = off
  max_parallel_workers_per_gather=0

setup:
  create table t_1m_20(i int);
  vacuum (freeze, analyze) t_1m_20;
  insert into t_1m_20 select (random()*1000000)::int4
    from generate_series(1,20000000);

query:
  set work_mem='2048kB';
  SELECT pg_prewarm('t_1m_20', 'buffer');
  -- median of the three runs
  select distinct i from t_1m_20 offset 10000000;
  select distinct i from t_1m_20 offset 10000000;
  select distinct i from t_1m_20 offset 10000000;

results:
  f2130e77 (before using HLL):        6787ms
  f1af75c5 (before my recent commit): 7170ms
  fd734f38 (master now):              6990ms

My previous results, taken before I committed the patch (and therefore not
on the same exact SHA1s), were 6812, 7158, and 6898. So my most recent
batch of results is slightly worse, but the most recent commit (fd734f38)
still shows an improvement of a couple percent.

Regards,
	Jeff Davis