On 7/6/24 05:04, Andrey M. Borodin wrote:>> On 5 Jul 2024, at 23:46, Paul Jungwirth
<p...@illuminatedcomputing.com> wrote:
this commit adds support for all combinations of int2/int4/int8 for all five btree
operators (</<=/=/>=/>).
Looks like a nice feature to have.
Would it make sense to do something similar to float8? Or, perhaps, some other
types from btree_gist?
Here is another patch adding float4/8 and also date/timestamp/timestamptz, in the same combinations
as btree.
No other types seem like they deserve this treatment. For example btree doesn't
mix oids with ints.
Also, are we sure such tests will be stable?
You're right, it was questionable. We hadn't analyzed the table, and after doing that the plan
changes from a bitmap scan to an index-only scan. That makes more sense, and I doubt it will change
now that it's based on statistics.
Yours,
--
Paul ~{:-)
p...@illuminatedcomputing.com
From c3b5f0c135b1d430f9ebf1367644f70726f5bb41 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" <p...@illuminatedcomputing.com>
Date: Fri, 5 Jul 2024 11:16:09 -0700
Subject: [PATCH v2] Use GiST index with mixed integer/float/timestamp widths
A GiST index on a bigint will not be used when compared to an int
literal, unless the user casts the literal to a bigint. This is
surprising and causes invisible slowness. To fix that we can add pg_amop
entries for mixed-width arguments. We do that for all int2/int4/int8
combinations, and also for float4/float8 and date/timestamp/timestamptz
(not that timestamptz is wider than timestamp, but we can follow btree
in allowing mixed operators for them too).
With btree, we have one opfamily for all integer types, and it supports
=(int2,int8), etc. With GiST we have a separate opfamily for each width,
so it's less obvious where to put the mixed-width operators. But it
makes sense for wider opfamilies to include support for smaller types,
so I added =(int2,int8) and =(int4,int8) to gist_int8_ops, and
=(int2,int4) to gist_int4_ops. Also =(float4,float8) to
gist_float8_ops, =(date,timestamptz) and =(timestamp,timestamptz to
gist_timestamptz_ops, and =(date,timestamp) to gist_timestamp_ops. And
as long as we're doing this, we might as well support all five btree
operators (<, <=, =, >=, >).
---
contrib/btree_gist/Makefile | 3 +-
contrib/btree_gist/btree_gist--1.7--1.8.sql | 101 ++++++++++++++++++++
contrib/btree_gist/btree_gist.control | 2 +-
contrib/btree_gist/expected/float8.out | 12 +++
contrib/btree_gist/expected/int4.out | 12 +++
contrib/btree_gist/expected/int8.out | 21 ++++
contrib/btree_gist/expected/timestamp.out | 12 +++
contrib/btree_gist/expected/timestamptz.out | 22 +++++
contrib/btree_gist/meson.build | 1 +
contrib/btree_gist/sql/float8.sql | 8 ++
contrib/btree_gist/sql/int4.sql | 7 ++
contrib/btree_gist/sql/int8.sql | 10 ++
contrib/btree_gist/sql/timestamp.sql | 7 ++
contrib/btree_gist/sql/timestamptz.sql | 11 +++
14 files changed, 227 insertions(+), 2 deletions(-)
create mode 100644 contrib/btree_gist/btree_gist--1.7--1.8.sql
diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index 073dcc745c4..965f0559d64 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -33,7 +33,8 @@ EXTENSION = btree_gist
DATA = btree_gist--1.0--1.1.sql \
btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \
btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
- btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql
+ btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
+ btree_gist--1.7--1.8.sql
PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
diff --git a/contrib/btree_gist/btree_gist--1.7--1.8.sql b/contrib/btree_gist/btree_gist--1.7--1.8.sql
new file mode 100644
index 00000000000..221df3cd6f8
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.7--1.8.sql
@@ -0,0 +1,101 @@
+/* contrib/btree_gist/btree_gist--1.6--1.7.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.7'" to load this file. \quit
+
+-- Add mixed integer operators.
+-- This lets Postgres use the index without casting literals to bigints, etc.
+-- Whereas btree has one integer_ops opfamily,
+-- GiST has gist_int2_ops, gist_int4_ops, and gist_int8_ops.
+-- We add the mixed operators to whichever opfamily matches the larger type,
+-- sort of like "promoting".
+ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
+ OPERATOR 1 < (int8, int4),
+ OPERATOR 2 <= (int8, int4),
+ OPERATOR 3 = (int8, int4),
+ OPERATOR 4 >= (int8, int4),
+ OPERATOR 5 > (int8, int4),
+
+ OPERATOR 1 < (int4, int8),
+ OPERATOR 2 <= (int4, int8),
+ OPERATOR 3 = (int4, int8),
+ OPERATOR 4 >= (int4, int8),
+ OPERATOR 5 > (int4, int8),
+
+ OPERATOR 1 < (int8, int2),
+ OPERATOR 2 <= (int8, int2),
+ OPERATOR 3 = (int8, int2),
+ OPERATOR 4 >= (int8, int2),
+ OPERATOR 5 > (int8, int2),
+
+ OPERATOR 1 < (int2, int8),
+ OPERATOR 2 <= (int2, int8),
+ OPERATOR 3 = (int2, int8),
+ OPERATOR 4 >= (int2, int8),
+ OPERATOR 5 > (int2, int8);
+
+ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
+ OPERATOR 1 < (int4, int2),
+ OPERATOR 2 <= (int4, int2),
+ OPERATOR 3 = (int4, int2),
+ OPERATOR 4 >= (int4, int2),
+ OPERATOR 5 > (int4, int2),
+
+ OPERATOR 1 < (int2, int4),
+ OPERATOR 2 <= (int2, int4),
+ OPERATOR 3 = (int2, int4),
+ OPERATOR 4 >= (int2, int4),
+ OPERATOR 5 > (int2, int4);
+
+-- Add mixed floating point operators.
+ALTER OPERATOR FAMILY gist_float8_ops USING GIST ADD
+ OPERATOR 1 < (float8, float4),
+ OPERATOR 2 <= (float8, float4),
+ OPERATOR 3 = (float8, float4),
+ OPERATOR 4 >= (float8, float4),
+ OPERATOR 5 > (float8, float4),
+
+ OPERATOR 1 < (float4, float8),
+ OPERATOR 2 <= (float4, float8),
+ OPERATOR 3 = (float4, float8),
+ OPERATOR 4 >= (float4, float8),
+ OPERATOR 5 > (float4, float8);
+
+-- Add mixed date/time operators.
+ALTER OPERATOR FAMILY gist_timestamptz_ops USING GIST ADD
+ OPERATOR 1 < (timestamptz, timestamp),
+ OPERATOR 2 <= (timestamptz, timestamp),
+ OPERATOR 3 = (timestamptz, timestamp),
+ OPERATOR 4 >= (timestamptz, timestamp),
+ OPERATOR 5 > (timestamptz, timestamp),
+
+ OPERATOR 1 < (timestamp, timestamptz),
+ OPERATOR 2 <= (timestamp, timestamptz),
+ OPERATOR 3 = (timestamp, timestamptz),
+ OPERATOR 4 >= (timestamp, timestamptz),
+ OPERATOR 5 > (timestamp, timestamptz),
+
+ OPERATOR 1 < (timestamptz, date),
+ OPERATOR 2 <= (timestamptz, date),
+ OPERATOR 3 = (timestamptz, date),
+ OPERATOR 4 >= (timestamptz, date),
+ OPERATOR 5 > (timestamptz, date),
+
+ OPERATOR 1 < (date, timestamptz),
+ OPERATOR 2 <= (date, timestamptz),
+ OPERATOR 3 = (date, timestamptz),
+ OPERATOR 4 >= (date, timestamptz),
+ OPERATOR 5 > (date, timestamptz);
+
+ALTER OPERATOR FAMILY gist_timestamp_ops USING GIST ADD
+ OPERATOR 1 < (timestamp, date),
+ OPERATOR 2 <= (timestamp, date),
+ OPERATOR 3 = (timestamp, date),
+ OPERATOR 4 >= (timestamp, date),
+ OPERATOR 5 > (timestamp, date),
+
+ OPERATOR 1 < (date, timestamp),
+ OPERATOR 2 <= (date, timestamp),
+ OPERATOR 3 = (date, timestamp),
+ OPERATOR 4 >= (date, timestamp),
+ OPERATOR 5 > (date, timestamp);
diff --git a/contrib/btree_gist/btree_gist.control b/contrib/btree_gist/btree_gist.control
index fa9171a80a2..abf66538f32 100644
--- a/contrib/btree_gist/btree_gist.control
+++ b/contrib/btree_gist/btree_gist.control
@@ -1,6 +1,6 @@
# btree_gist extension
comment = 'support for indexing common datatypes in GiST'
-default_version = '1.7'
+default_version = '1.8'
module_pathname = '$libdir/btree_gist'
relocatable = true
trusted = true
diff --git a/contrib/btree_gist/expected/float8.out b/contrib/btree_gist/expected/float8.out
index ebd0ef3d689..304748bb35d 100644
--- a/contrib/btree_gist/expected/float8.out
+++ b/contrib/btree_gist/expected/float8.out
@@ -89,3 +89,15 @@ SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
-1769.73634 | 120.26366000000007
(3 rows)
+SET enable_seqscan=on;
+ANALYZE float8tmp;
+-- It should use the index with a different float width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM float8tmp WHERE a = real '42';
+ QUERY PLAN
+----------------------------------------------------
+ Aggregate
+ -> Index Only Scan using float8idx on float8tmp
+ Index Cond: (a = '42'::real)
+(3 rows)
+
diff --git a/contrib/btree_gist/expected/int4.out b/contrib/btree_gist/expected/int4.out
index 6bbdc7c3f4b..e091ed23386 100644
--- a/contrib/btree_gist/expected/int4.out
+++ b/contrib/btree_gist/expected/int4.out
@@ -89,3 +89,15 @@ SELECT a, a <-> '237' FROM int4tmp ORDER BY a <-> '237' LIMIT 3;
228 | 9
(3 rows)
+SET enable_seqscan=on;
+ANALYZE int4tmp;
+-- It should use the index with a different integer width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int4tmp WHERE a = smallint '42';
+ QUERY PLAN
+------------------------------------------------
+ Aggregate
+ -> Index Only Scan using int4idx on int4tmp
+ Index Cond: (a = '42'::smallint)
+(3 rows)
+
diff --git a/contrib/btree_gist/expected/int8.out b/contrib/btree_gist/expected/int8.out
index eff77c26b5a..04549a94b52 100644
--- a/contrib/btree_gist/expected/int8.out
+++ b/contrib/btree_gist/expected/int8.out
@@ -89,3 +89,24 @@ SELECT a, a <-> '464571291354841' FROM int8tmp ORDER BY a <-> '464571291354841'
478227196042750 | 13655904687909
(3 rows)
+SET enable_seqscan=on;
+ANALYZE int8tmp;
+-- It should use the index with a different integer width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int8tmp WHERE a = integer '42';
+ QUERY PLAN
+------------------------------------------------
+ Aggregate
+ -> Index Only Scan using int8idx on int8tmp
+ Index Cond: (a = 42)
+(3 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int8tmp WHERE a = smallint '42';
+ QUERY PLAN
+------------------------------------------------
+ Aggregate
+ -> Index Only Scan using int8idx on int8tmp
+ Index Cond: (a = '42'::smallint)
+(3 rows)
+
diff --git a/contrib/btree_gist/expected/timestamp.out b/contrib/btree_gist/expected/timestamp.out
index 0d94f2f245c..14d05d24eab 100644
--- a/contrib/btree_gist/expected/timestamp.out
+++ b/contrib/btree_gist/expected/timestamp.out
@@ -89,3 +89,15 @@ SELECT a, a <-> '2004-10-26 08:55:08' FROM timestamptmp ORDER BY a <-> '2004-10-
Mon Nov 29 20:12:43 2004 | @ 34 days 11 hours 17 mins 35 secs
(3 rows)
+SET enable_seqscan=on;
+ANALYZE timestamptmp;
+-- It should use the index when comparing to a date:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptmp WHERE a = date '2018-12-18';
+ QUERY PLAN
+----------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using timestampidx on timestamptmp
+ Index Cond: (a = '12-18-2018'::date)
+(3 rows)
+
diff --git a/contrib/btree_gist/expected/timestamptz.out b/contrib/btree_gist/expected/timestamptz.out
index 75a15a42568..ed58f9690d5 100644
--- a/contrib/btree_gist/expected/timestamptz.out
+++ b/contrib/btree_gist/expected/timestamptz.out
@@ -209,3 +209,25 @@ SELECT a, a <-> '2018-12-18 10:59:54 GMT+2' FROM timestamptztmp ORDER BY a <-> '
Thu Jan 24 12:28:12 2019 PST | @ 37 days 7 hours 28 mins 18 secs
(3 rows)
+SET enable_seqscan=on;
+ANALYZE timestamptztmp;
+-- It should use the index when comparing to a timestamp:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptztmp WHERE a = timestamp '2018-12-18 10:59:54 GMT+3';
+ QUERY PLAN
+-----------------------------------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using timestamptzidx on timestamptztmp
+ Index Cond: (a = 'Tue Dec 18 10:59:54 2018'::timestamp without time zone)
+(3 rows)
+
+-- It should use the index when comparing to a date:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptztmp WHERE a = date '2018-12-18';
+ QUERY PLAN
+--------------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using timestamptzidx on timestamptztmp
+ Index Cond: (a = '12-18-2018'::date)
+(3 rows)
+
diff --git a/contrib/btree_gist/meson.build b/contrib/btree_gist/meson.build
index c88a6ac84ae..4b148edebce 100644
--- a/contrib/btree_gist/meson.build
+++ b/contrib/btree_gist/meson.build
@@ -50,6 +50,7 @@ install_data(
'btree_gist--1.4--1.5.sql',
'btree_gist--1.5--1.6.sql',
'btree_gist--1.6--1.7.sql',
+ 'btree_gist--1.7--1.8.sql',
kwargs: contrib_data_args,
)
diff --git a/contrib/btree_gist/sql/float8.sql b/contrib/btree_gist/sql/float8.sql
index e1e819b37f9..e03261399ea 100644
--- a/contrib/btree_gist/sql/float8.sql
+++ b/contrib/btree_gist/sql/float8.sql
@@ -35,3 +35,11 @@ SELECT count(*) FROM float8tmp WHERE a > -1890.0::float8;
EXPLAIN (COSTS OFF)
SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
SELECT a, a <-> '-1890.0' FROM float8tmp ORDER BY a <-> '-1890.0' LIMIT 3;
+
+SET enable_seqscan=on;
+ANALYZE float8tmp;
+
+-- It should use the index with a different float width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM float8tmp WHERE a = real '42';
+
diff --git a/contrib/btree_gist/sql/int4.sql b/contrib/btree_gist/sql/int4.sql
index 659ab5ee24b..bb7e5865486 100644
--- a/contrib/btree_gist/sql/int4.sql
+++ b/contrib/btree_gist/sql/int4.sql
@@ -35,3 +35,10 @@ SELECT count(*) FROM int4tmp WHERE a > 237::int4;
EXPLAIN (COSTS OFF)
SELECT a, a <-> '237' FROM int4tmp ORDER BY a <-> '237' LIMIT 3;
SELECT a, a <-> '237' FROM int4tmp ORDER BY a <-> '237' LIMIT 3;
+
+SET enable_seqscan=on;
+ANALYZE int4tmp;
+
+-- It should use the index with a different integer width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int4tmp WHERE a = smallint '42';
diff --git a/contrib/btree_gist/sql/int8.sql b/contrib/btree_gist/sql/int8.sql
index 51e55e9c14b..99f89605dd5 100644
--- a/contrib/btree_gist/sql/int8.sql
+++ b/contrib/btree_gist/sql/int8.sql
@@ -35,3 +35,13 @@ SELECT count(*) FROM int8tmp WHERE a > 464571291354841::int8;
EXPLAIN (COSTS OFF)
SELECT a, a <-> '464571291354841' FROM int8tmp ORDER BY a <-> '464571291354841' LIMIT 3;
SELECT a, a <-> '464571291354841' FROM int8tmp ORDER BY a <-> '464571291354841' LIMIT 3;
+
+SET enable_seqscan=on;
+ANALYZE int8tmp;
+
+-- It should use the index with a different integer width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int8tmp WHERE a = integer '42';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int8tmp WHERE a = smallint '42';
diff --git a/contrib/btree_gist/sql/timestamp.sql b/contrib/btree_gist/sql/timestamp.sql
index 95effebfc47..2598647efc2 100644
--- a/contrib/btree_gist/sql/timestamp.sql
+++ b/contrib/btree_gist/sql/timestamp.sql
@@ -35,3 +35,10 @@ SELECT count(*) FROM timestamptmp WHERE a > '2004-10-26 08:55:08'::timestamp;
EXPLAIN (COSTS OFF)
SELECT a, a <-> '2004-10-26 08:55:08' FROM timestamptmp ORDER BY a <-> '2004-10-26 08:55:08' LIMIT 3;
SELECT a, a <-> '2004-10-26 08:55:08' FROM timestamptmp ORDER BY a <-> '2004-10-26 08:55:08' LIMIT 3;
+
+SET enable_seqscan=on;
+ANALYZE timestamptmp;
+
+-- It should use the index when comparing to a date:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptmp WHERE a = date '2018-12-18';
diff --git a/contrib/btree_gist/sql/timestamptz.sql b/contrib/btree_gist/sql/timestamptz.sql
index f70caa4a649..f76e87d51bd 100644
--- a/contrib/btree_gist/sql/timestamptz.sql
+++ b/contrib/btree_gist/sql/timestamptz.sql
@@ -78,3 +78,14 @@ SELECT count(*) FROM timestamptztmp WHERE a > '2018-12-18 10:59:54 GMT+4'::time
EXPLAIN (COSTS OFF)
SELECT a, a <-> '2018-12-18 10:59:54 GMT+2' FROM timestamptztmp ORDER BY a <-> '2018-12-18 10:59:54 GMT+2' LIMIT 3;
SELECT a, a <-> '2018-12-18 10:59:54 GMT+2' FROM timestamptztmp ORDER BY a <-> '2018-12-18 10:59:54 GMT+2' LIMIT 3;
+
+SET enable_seqscan=on;
+ANALYZE timestamptztmp;
+
+-- It should use the index when comparing to a timestamp:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptztmp WHERE a = timestamp '2018-12-18 10:59:54 GMT+3';
+
+-- It should use the index when comparing to a date:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM timestamptztmp WHERE a = date '2018-12-18';
--
2.42.0