On Mon, Sep 2, 2019 at 9:28 AM Heikki Linnakangas <hlinn...@iki.fi> wrote: > On 02/09/2019 07:54, Alexander Korotkov wrote: > > NULL and '(NaN,NaN)' are swapped. It happens so, because we assume > > distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to > > be greater than NULL. If even we would assume distance to NULL to be > > Nan, it doesn't guarantee that NULLs would be last. It looks like we > > can handle this only by introduction array of "distance is null" flags > > to GISTSearchItem. But does it worth it? > > I don't think we have much choice, returning wrong results is not an > option. At first I thought we could set the "recheck" flag if we see > NULLs or NaNs, but that won't work because rechecking is not supported > with Index-Only Scans. > > Looking at the corresponding SP-GiST code, > pairingheap_SpGistSearchItem_cmp() gets this right. I'd suggest > copy-pasting the implementation from there, so that they would be as > similar as possible. (SP-GiST gets away with just one isNull flag, > because it doesn't support multi-column indexes. In GiST, you need an > array, as you said.)
Thank you for clarifying my doubts. Second of attached patches solves this problem. In this patch GISTSearchItem have both array of distances and array of null flags at the end. They are accessed by a bit cumbersome GISTSearchItemDistanceValues() and GISTSearchItemDistanceNulls() macros. I came up to this, because I didn't like to allocate null flags or distances in separate chunk of memory. Also I had idea to wrap distance and null flag into struct, but I didn't like to change signature of index_store_float8_orderby_distances() that much. I'm going to push both if no objections. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
0001-gist-pairing-heap-cmp-fix-2.patch
Description: Binary data
0002-knn-gist-fix-null-distances-2.patch
Description: Binary data