Strange behavior with polygon and NaN

Jesse Zhang Wed, 24 Jun 2020 15:12:24 -0700

Hi hackers,

While working with Chris Hajas on merging Postgres 12 with Greenplum
Database we stumbled upon the following strange behavior in the geometry
type polygon:


------ >8 --------

CREATE TEMP TABLE foo (p point);
CREATE INDEX ON foo USING gist(p);

INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');

SELECT $q$
SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
$q$ AS qry \gset

BEGIN;
SAVEPOINT yolo;
SET LOCAL enable_seqscan TO off;
:qry;

ROLLBACK TO SAVEPOINT yolo;
SET LOCAL enable_indexscan TO off;
SET LOCAL enable_bitmapscan TO off;
:qry;

------ 8< --------

If you run the above repro SQL in HEAD (and 12, and likely all older
versions), you get the following output:

CREATE TABLE
CREATE INDEX
INSERT 0 3
BEGIN
SAVEPOINT
SET
   p
-------
 (0,0)
 (1,1)
(2 rows)

ROLLBACK
SET
SET
     p
-----------
 (0,0)
 (1,1)
 (NaN,NaN)
(3 rows)


At first glance, you'd think this is the gist AM's bad, but on a second
thought, something else is strange here. The following query returns
true:

SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'

The above behavior of the "contained in" operator is surprising, and
it's probably not what the GiST AM is expecting. I took a look at
point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
distnace operator for polygon. It gives the following interesting
output:

SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
FROM (
  SELECT circle(point(100 * i, 'NaN'), 50) AS c
  FROM generate_series(-2, 4) i
) t(c)
ORDER BY 2;

        c        | distance
-----------------+----------
 <(-200,NaN),50> |        0
 <(-100,NaN),50> |        0
 <(0,NaN),50>    |        0
 <(100,NaN),50>  |        0
 <(200,NaN),50>  |      NaN
 <(300,NaN),50>  |      NaN
 <(400,NaN),50>  |      NaN
(7 rows)

Should they all be NaN? Am I alone in thinking the index is right but
the operators are wrong? Or should we call the indexes wrong here?

Cheers,
Jesse and Chris

Strange behavior with polygon and NaN

Reply via email to