On Wed, Mar 13, 2019 at 8:18 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> I started to consider rewriting the data structure into something more
> like B-tree. Then I remembered that I wrote a data structure pretty much
> like that last year already! We discussed that on the "Vacuum: allow
> usage of more than 1GB of work mem" thread [2], to replace the current
> huge array that holds the dead TIDs during vacuum.
>
> So I dusted off that patch, and made it more general, so that it can be
> used to store arbitrary 64-bit integers, rather than ItemPointers or
> BlockNumbers. I then added a rudimentary form of compression to the leaf
> pages, so that clusters of nearby values can be stored as an array of
> 32-bit integers, or as a bitmap. That would perhaps be overkill, if it
> was just to conserve some memory in GiST vacuum, but I think this will
> turn out to be a useful general-purpose facility.
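
To check that I read the leaf compression idea correctly, here is a toy
sketch of the "array of 32-bit integers" case, i.e. storing a cluster of
nearby values as 32-bit deltas from the leaf's first value. This is only my
own illustration; all names in it are made up and nothing is taken from the
patch:

#include <stdint.h>
#include <stdbool.h>

/* Can every value in the leaf be expressed as items[0] + a 32-bit delta? */
static bool
leaf_fits_uint32(const uint64_t *items, int nitems)
{
	return nitems > 0 && items[nitems - 1] - items[0] <= UINT32_MAX;
}

/* Encode the (sorted) values as 32-bit deltas from items[0]. */
static void
encode_leaf_deltas(const uint64_t *items, int nitems, uint32_t *deltas)
{
	for (int i = 0; i < nitems; i++)
		deltas[i] = (uint32_t) (items[i] - items[0]);
}

If the values are even denser than that, a bitmap over the leaf's range
obviously beats the 32-bit array, which I assume is what the second leaf
format in the patch is for.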
I had a quick look at it, so I thought some first comments could be helpful.

+ * If you change this, you must recalculate MAX_INTERVAL_LEVELS, too!
+ * MAX_INTERNAL_ITEMS ^ MAX_INTERNAL_LEVELS >= 2^64.

I think that MAX_INTERVAL_LEVELS was a typo for MAX_INTERNAL_LEVELS, which
has probably been renamed to MAX_TREE_LEVELS in this patch.

+ * with varying levels of "compression". Which one is used depending on the
+ * values stored.

depends on?

+	if (newitem <= sbs->last_item)
+		elog(ERROR, "cannot insert to sparse bitset out of order");

Is there any reason to disallow inserting duplicates? AFAICT nothing
prevents that in the current code. If that's intended, it should probably
be documented.

Nothing struck me other than that, that's a pretty nice new lib :)
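
PS: to be concrete about the duplicates question, this is roughly the
relaxation I had in mind; just a sketch reusing the names from the quoted
hunk (sbs, last_item, newitem), not code from the patch:

	/*
	 * Still reject values that would break the ascending insert order,
	 * but treat re-inserting the last value as a no-op instead of an error.
	 */
	if (newitem < sbs->last_item)
		elog(ERROR, "cannot insert to sparse bitset out of order");
	else if (newitem == sbs->last_item)
		return;					/* duplicate of the previous item, ignore */

Of course that only catches a duplicate of the most recently inserted
value, which is all that last_item can tell us anyway.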