Apparently cputube did not pick the last version of the patches I've submitted in December (and I don't see the message in the thread in archive either), so it's listed as broken.
So here we go again, hopefully this time everything will go through ... regards On 12/28/18 12:45 AM, Tomas Vondra wrote: > Hi all, > > Attached is an updated/rebased version of the patch series. There are no > changes to behavior, but let me briefly summarize the current state: > > 0001 and 0002 > ------------- > > The first two parts are "just" refactoring the existing code to pass all > scankeys to the opclass at once - this is needed by the new minmax-like > opclass, but per discussion with Alvaro it seems worthwhile even > independently. I tend to agree with that. Similarly for the second part, > which moves all IS NULL checks entirely to bringetbimap(). > > 0003 bloom opclass > ------------------ > > The first new opclasss, based on bloom filters. For each page range > (i.e. 1MB by default) a small bloom filter is built (with hash values of > the original values as inputs), and then used to evaluate equality > queries. A small optimization is that initially the actual (hash) values > are kept until reaching the bloom filter size. This improves behavior in > low-cardinality data sets. > > Picking the bloom filter parameters is the tricky part - we don't have a > reliable source of such information (namely number of distinct values > per range), and e.g. the false positive rate actually has to be picked > by the user because it's a compromise between index size and accuracy. > Essentially, false positive rate is the fraction of the table that has > to be scanned for a random value (on average). But it also makes the > index larger, because the per-range bloom filters will be larger. > > Another reason why this needs to be defined by the user is that the > space for index tuple is limited by one page (8kB by default), so we > can't allow the bloom filter to be larger (we have to assume it's > non-compressible, because in the optimal fill it's 50% 0s and 1s). But > the BRIN index may be multi-column, and the limit applies to the whole > tuple. And we don't know what the opclasses or parameters of other > columns are. > > So the patch simply adds two reloptions > > a) n_distinct_per_range - number of distinct values per range > b) false_positive_rate - false positive rate of the filter > > There are some simple heuristics to ensure the values are reasonable > (e.g. upper limit for number of distinct values, etc.) and perhaps we > might consider stats from the underlying table (when not empty), but the > patch does not do that. > > > 0004 multi-minmax opclass > ------------------------- > > The second opclass addresses a common issue for minmax indexes, where > the table is initially nicely correlated with the index, and it works > fine. But then deletes/updates route data into other parts of the table > making the ranges very wide ad rendering the BRIN index inefficient. > > One way to deal improve this would be considering the index(es) while > routing the new tuple, i.e. looking not only for page with enough free > space, but for pages in already matching ranges (or close to it). > > A partitioning is a possible approach so segregate the data. But it's > certainly much higher overhead, both in terms of maintenance and > planning (particularly with 1:1 of ranges vs. partitions). > > So the new multi-minmax opclass takes a different approach, replacing > the one minmax range with multiple ranges (64 boundary values or 32 > ranges by default). Initially individual values are stored, and after > reaching the maximum number of values the values are merged into ranges > by distance. This allows handling outliers very efficiently, because > they will not be merged with the "main" range for as long as possible. > > Similarly to the bloom opclass, the main challenge here is deciding the > parameter - in this case, it's "number of values per range". Again, it's > a compromise vs. index size and efficiency. The default (64 values) is > fairly reasonable, but ultimately it's up to the user - there is a new > reloption "values_per_range". > > > > regards > -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
0001-Pass-all-keys-to-BRIN-consistent-function-a-20190223.patch.gz
Description: application/gzip
0002-Move-IS-NOT-NULL-checks-to-bringetbitmap-20190223.patch.gz
Description: application/gzip
0003-BRIN-bloom-indexes-20190223.patch.gz
Description: application/gzip
0004-BRIN-multi-range-minmax-indexes-20190223.patch.gz
Description: application/gzip