Re: Bug in brin_minmax_multi_distance_numeric()

Tom Lane Wed, 06 Aug 2025 10:33:24 -0700

I wrote:
> Tomas Vondra <[email protected]> writes:
>> Agreed it's a bug on 32-bit machines. Not sure about 64-bits.


> Yeah, I'm not 100% sure about that.  It's certainly doing something
> unexpected, but we might accidentally end up with relatively-sane
> relative distance comparisons anyway.  (I assume the outputs will
> only be compared to other outputs of the same function, right?)
> I have a vague recollection that the IEEE float format was chosen with
> an eye to making comparisons cheap, ie not too much different from
> integer comparisons.  So the sort order might be about the same
> even after incorrectly reinterpreting the bit-pattern as an int.
> NaNs probably mess that up, but they would anyway.

For the archives' sake: I researched this a little more, and
verified my recollection wasn't totally inaccurate.  If you compare
two IEEE-format floats using two's-complement integer arithmetic, then:

* You will get the same less/equal/greater result as the correct
floating-point comparison result if both numbers are nonnegative,
or if one is nonnegative and the other is negative.

* You will get the opposite of the correct result (inverted sort
order) if both numbers are negative.

* If either number is NaN, things would normally not work in a
way comparable to floating-point behavior, but as long as all
the NaNs are identically represented ... which they would be ...
it'd actually sort the same way we sort NaNs, as larger than
plus-infinity.
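
The three cases above can be checked with a small sketch.  This is
illustrative Python, not anything from the PostgreSQL tree: the helper
names float_bits_as_int64 and int_cmp_of_floats are made up here, and
QUIET_NAN_BITS assumes the common positive quiet-NaN bit pattern for
float8.

```python
import math
import struct

# Assumption: the usual positive quiet-NaN bit pattern for an IEEE double.
QUIET_NAN_BITS = 0x7FF8000000000000


def float_bits_as_int64(x: float) -> int:
    """Reinterpret the 8 bytes of an IEEE double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def three_way(a, b) -> int:
    """Return -1, 0 or 1 for less / equal / greater."""
    return (a > b) - (a < b)


def int_cmp_of_floats(a: float, b: float) -> int:
    """Compare two floats by their bit patterns, as signed integers."""
    return three_way(float_bits_as_int64(a), float_bits_as_int64(b))


# Both nonnegative: integer comparison agrees with float comparison.
print(int_cmp_of_floats(1.5, 2.5), three_way(1.5, 2.5))

# Mixed signs: still agrees.
print(int_cmp_of_floats(-1.0, 2.0), three_way(-1.0, 2.0))

# Both negative: integer comparison is inverted.
print(int_cmp_of_floats(-1.0, -2.0), three_way(-1.0, -2.0))

# A positive quiet NaN has a larger bit pattern than plus-infinity.
print(float_bits_as_int64(math.inf) < QUIET_NAN_BITS)
```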

Now that's not exactly what the broken code was doing: it was
interpreting the bits as an integer, converting that hallucinated
integer to a float, and then (later) subtracting two such floats.
However, as long as negative values aren't involved, the float
conversion would preserve sort order and thus give distance
numbers that are at least topologically sane.
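
To make that concrete, here is a rough Python model of the behavior
described above, assuming a 64-bit machine: reinterpret the float8 bits
as a signed integer, convert that integer to a float, and subtract.
The names hallucinated_float and broken_distance are hypothetical and
do not correspond to the actual C code.

```python
import struct


def hallucinated_float(x: float) -> float:
    """Model of the bug: read the double's bits as int64, then cast to float."""
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    return float(bits)


def broken_distance(a: float, b: float) -> float:
    """Distance as the broken code would effectively have computed it."""
    return hallucinated_float(b) - hallucinated_float(a)


values = [0.0, 0.5, 1.0, 2.0, 10.0, 1e6]
mapped = [hallucinated_float(v) for v in values]

# Sort order survives for nonnegative inputs ...
print(mapped == sorted(mapped))

# ... but the distances bear little resemblance to the true ones.
for lo, hi in zip(values, values[1:]):
    print(f"{lo!r} -> {hi!r}: true {hi - lo!r}, broken {broken_distance(lo, hi)!r}")
```

In this model the distance from 0.5 to 1.0 comes out the same as the
distance from 1.0 to 2.0 (2^52 in both cases), since each doubling just
bumps the exponent field by one.  So the broken distances behave more
like a logarithmic scale than a linear one.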

Nonetheless, we had better recommend reindexing these indexes
even on 64-bit machines.  Even if the distances calculated before
the fix weren't totally insane, they will be quite a bit different
from the distances calculated after the fix.  So I fear that even
if an index was more or less okay beforehand, it's likely to degrade
pretty badly once we start making new merge decisions with a different
distance function.

                        regards, tom lane
