I wrote:
> Tomas Vondra <to...@vondra.me> writes:
>> Agreed it's a bug on 32-bit machines. Not sure about 64-bits.

> Yeah, I'm not 100% sure about that. It's certainly doing something
> unexpected, but we might accidentally end up with relatively-sane
> relative distance comparisons anyway. (I assume the outputs will
> only be compared to other outputs of the same function, right?)

> I have a vague recollection that the IEEE float format was chosen with
> an eye to making comparisons cheap, ie not too much different from
> integer comparisons. So the sort order might be about the same
> even after incorrectly reinterpreting the bit-pattern as an int.
> NaNs probably mess that up, but they would anyway.

For the archives' sake: I researched this a little more, and verified
my recollection wasn't totally inaccurate. If you compare two
IEEE-format floats using 2s-complement integer arithmetic, then:

* You will get the same less/equal/greater result as the correct
floating-point comparison result if both numbers are nonnegative,
or if one is nonnegative and the other is negative.

* You will get the opposite of the correct result (inverted sort
order) if both numbers are negative.

* If either number is NaN, things would normally not work in a
way comparable to floating-point behavior, but as long as all
the NaNs are identically represented ... which they would be ...
it'd actually sort the same way we sort NaNs, as larger than
plus-infinity.

Now that's not exactly what the broken code was doing: it was
interpreting the bits as an integer, converting that hallucinated
integer to a float, and then (later) subtracting two such floats.
However, as long as negative values aren't involved, the float
conversion would preserve sort order and thus give distance numbers
that are at least topologically sane.

Nonetheless, we had better recommend reindexing these indexes even
on 64-bit machines. Even if the distances calculated before the fix
weren't totally insane, they will be quite a bit different from the
distances calculated after the fix. So I fear that even if an index
was more or less okay beforehand, it's likely to degrade pretty badly
once we start making new merge decisions with a different distance
function.

			regards, tom lane
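
Addendum for anyone wanting to check the comparison rules above: the
following is a minimal sketch in Python, not the PostgreSQL code. The
helper names (bits_as_int, cmp_float, cmp_bits, broken_value) are
hypothetical, chosen only for illustration. It assumes 8-byte IEEE
doubles and reads broken_value as a rough model of "interpret the bits
as an integer, then convert that integer to a float".

```python
import math
import struct


def bits_as_int(x: float) -> int:
    """Reinterpret the 8 bytes of an IEEE double as a signed 64-bit int."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def cmp_float(a: float, b: float) -> int:
    """Ordinary three-way floating-point comparison (NaN not special-cased)."""
    return (a > b) - (a < b)


def cmp_bits(a: float, b: float) -> int:
    """Three-way comparison of the bit patterns as 2s-complement integers."""
    ia, ib = bits_as_int(a), bits_as_int(b)
    return (ia > ib) - (ia < ib)


def broken_value(x: float) -> float:
    """Hypothetical model of the bug: bits read as an int, then made a float."""
    return float(bits_as_int(x))


if __name__ == "__main__":
    pairs = [(1.5, 2.5), (-1.5, 2.5), (-1.5, -2.5)]
    for a, b in pairs:
        print(a, b, "float cmp:", cmp_float(a, b), "bit cmp:", cmp_bits(a, b))

    # math.nan is assumed to be the usual positive quiet NaN here.
    print("inf vs NaN, bit cmp:", cmp_bits(math.inf, math.nan))

    # Sort order survives for nonnegative inputs, but the magnitudes differ.
    print("correct distance:", 2.5 - 1.5)
    print("broken distance: ", broken_value(2.5) - broken_value(1.5))
```

The two comparisons should agree for the first two pairs and disagree
for the pair of negatives. The last two lines show why an index built
with the old distances is unlikely to mesh with the new ones: the order
is the same, but the numbers are nowhere near each other.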