On 3/9/21 9:51 PM, John Naylor wrote:
> 
> On Sun, Mar 7, 2021 at 8:53 PM Tomas Vondra
> <tomas.von...@enterprisedb.com <mailto:tomas.von...@enterprisedb.com>>
> wrote:
> [v20210308b]
> 
> I managed to trap an assertion that somehow doesn't happen during the
> regression tests. The callers of fill_expanded_ranges() do math like this:
> 
> /* both ranges and points are expanded into a separate element */
> neranges = ranges->nranges + ranges->nvalues;
> 
> but inside fill_expanded_ranges() we have this assertion:
> 
> /* Check that the output array has the right size. */
> Assert(neranges == (2 * ranges->nranges + ranges->nvalues));
> 

Yeah, that assert is bogus. It's calculating the number of boundary
values (and ranges have two), but in ExpandedRanges each we still
represent that as a single element. So the Assert should be just

Assert(neranges == (ranges->nranges + ranges->nvalues));

But maybe it's just an overkill and we don't really need it.

> In the regression test data, a debugging elog() shows that nranges is
> most often zero, so in that case, the math happens to be right either
> way. I can reliably get nranges above zero by running
> 
> update brintest_multi set int8col = int8col - 1;
> 
> a few times, at which point I get the crash.
>

Hmm, so maybe we should do something like this in regression tests too?
It's not good that we don't trigger the "nranges > 0" case at all.

> 
> Aside from that, the new changes look good. Just a couple small things:
> 
> +    allowed parameters.  Only the <literal>bloom</literal> operator class
> +    allows specifying parameters:
> 
> minmax-multi aren't mentioned here, but are mentioned further down.
> 

I forgot to add this bit. Will fix.

> + * Addresses from different families are consider to be in maximum
> 
> (comment above brin_minmax_multi_distance_inet)
> s/consider/considered/
> 

Will fix.

>> > 2) moving minmax/inclusion changes from 0002 to a separate patch 0003
>> >
>> > I think we should either ditch the 0003 (i.e. keep the existing
>> > opclasses unchanged) or commit 0003 (in which case I'd vote to just stop
>> > supporting the old signature of the consistent function).
>> >
>>
>> Still not sure what do to about this. I'm leaning towards keeping 0003
>> and just removing the "old" signature entirely, to keep the API cleaner.
>> It might cause some breakage in out-of-core BRIN opclasses, but that
>> seems like a reasonable price. Moreover, the opclasses may need some
>> updating anyway, because of the changes in handling NULL scan keys (0004
>> moves that from the opclass to the bringetbitmap function).
> 
> Keeping 0003 seems reasonable, given the above.
> 

And do you agree with removing the old signature entirely? That might
break some out-of-core opclasses, but we're not aware of any, and
they'll be broken anyway. Seems fine to me.

>> > The remaining part that didn't get much review is the very last patch,
>> > adding an option to ignore correlation for some BRIN opclases. This is
>> > needed as the regular BRIN costing is quite sensitive to correlation,
>> > and the cost gets way too high for poorly correlated data, making it
>> > unlikely the index will be used. But handling such data sets efficiently
>> > is the main point of those new opclasses. Any opinions on this?
>> >
>>
>> Not sure about this.
> 
> I hadn't given it much thought (nor tested), but I just took a look at
> brincostestimate(). If the table is badly correlated, I'm thinking the
> number of ranges that need to be scanned will increase regardless. Maybe
> rather than ignoring correlation, we could clamp it or otherwise tweak
> it. Not sure about the details, though, that would require some testing.
> 

Well, maybe. In any case we need to do something about this, otherwise
the new opclasses won't be used even in cases where it's perfectly OK.
And it needs to be opclass-dependent, in some way.

I'm pretty sure even the simple examples you've used to test
minmax-multi (with updating a fraction of tuples to low/high value)
would be affected by this.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply via email to