On 5 April 2017 at 17:34, Emre Hasegeli <e...@hasegeli.com> wrote: > >> Interested to hear comments on this. > > > I don't have chance to test it right now, but I am sure it would be an > improvement over what we have right now. There is no single correct > equation with so many unknowns we have. > >> *indexTotalCost += (numTuples * *indexSelectivity) * >> (cpu_index_tuple_cost + qual_op_cost); > > Have you preserved this line intentionally? This would make BRIN index scan > cost even higher, but the primary property of BRIN is to be cheap to scan. > Shouldn't bitmap heap scan node take into account of checking the tuples in > its cost?
Good point. That's wrong, but I'm confused at why you kept the: + *indexTotalCost += selec * numTuples * cpu_index_tuple_cost; at all if that's the case. All the BRIN scan is going to do is build a bitmap of the matching ranges found. I've ended up with: + /* + * Charge a small amount per range tuple which we expect to match to. This + * is meant to reflect the costs of manipulating the bitmap. The BRIN scan + * will set a bit for each page in the range when we find a matching + * range, so we must multiply the charge by the number of pages in the + * range. + */ + *indexTotalCost += 0.1 * cpu_operator_cost * estimatedRanges * + pagesPerRange; Which is inspired from cost_bitmap_tree_node(), and seems like a fair cost for setting a single bit. I also noticed that you were doing: + if (get_index_stats_hook && + (*get_index_stats_hook) (root, index->indexoid, 1, &vardata)) and + vardata.statsTuple = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(index->indexoid), + Int16GetDatum(1), I wondered why you picked to hardcode that as 1, and I thought that's surely a bug. But then looking deeper it seems to be copied from btcostestimate(), which also seems to be a long-standing bug introduced in a536ed53bca40cb0d199824e358a86fcfd5db7f2. I'll go write a patch for that one now. I've been looking at the estimates output by the following: create table ab (a int, b int); insert into ab select x/1000,x%1000 from generate_series(1,1000000)x; create index ab_a_idx on ab using brin (a) with (pages_per_range = 128); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 64); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 32); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 16); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 8); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 4); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 2); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; create index ab_a_idx on ab using brin (a) with (pages_per_range = 1); analyze ab; explain analyze select * from ab where a = 1; drop index ab_a_idx; and I see that the bitmap index scan gets more expensive the more ranges that are in the index, which of course makes sense. The small bitmap maintenance cost I added should stay about the same, since I multiplied it by the pagesPerRange, it may only drop slightly as it may expect to set fewer bits in the bitmap due to the ranges being more specific to the tuples in the heap. The row estimates seem pretty sane there too, based on how many were filtered out on the recheck. I do still have concerns about how the correlation value is used, and the fact that we use the highest value, but then the whole use of this value is far from perfect, and is just a rough indication of how things are. Apart from that, I'm pretty happy with this now. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
brin-correlation-drowley_v2.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers