Hi,

I've been looking at GiST to see if there could be a good way to do
parallel builds - and there might be, if the opclass supports sorted
builds, because then we could parallelize the sort.

But then I noticed we support this mode only for point_ops, because
that's the only opclass with sortsupport procedure. Which mostly makes
sense, because types like geometries, polygons, ... do not have a well
defined order.

Still, we have btree_gist, and I don't think there's much reason to not
support sorted builds for those opclasses, where the gist opclass is
defined on top of btree ordering semantics.

So this patch does that, and the difference (compared to builds with
buiffering=on) can easily be an order of magnitude - at least that's
what my tests show:


test=# create index on test_int8 using gist (a) with (buffering = on);
CREATE INDEX
Time: 578799.450 ms (09:38.799)

test=# create index on test_int8 using gist (a) with (buffering = auto);
CREATE INDEX
Time: 53022.593 ms (00:53.023)


test=# create index on test_uuid using gist (a) with (buffering = on);
CREATE INDEX
Time: 39322.799 ms (00:39.323)

test=# create index on test_uuid using gist (a) with (buffering = auto);
CREATE INDEX
Time: 6466.341 ms (00:06.466)


The WIP patch adds enables this for data types with a usable sortsupport
procedure, which excludes time, timetz, cash, interval, bit, vbit, bool,
enum and macaddr8. I assume time, timetz and macaddr8 could be added,
it's just that I didn't find any existing sortsupport procedure. Same
for cash, but IIRC it's mostly deprecated.

Of course, people probably don't use btree_gist with a single column,
because they could just use btree. It's useful for multi-column GiST
indexes, with data types requiring GiST. And if the other opclasses also
allow sorted builds, this patch makes that possible. Of course, most
"proper GiST opclasses" don't support that, but why not - it's easy.

FWIW this is also why sorted builds likely are not a very practical way
to do parallel builds for GiST - it would help only with a small part of
cases, I think.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From 011e6eee923a6f51668f3277f470d32922923d17 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Fri, 17 May 2024 20:36:08 +0200
Subject: [PATCH v20240517] allow sorted builds for btree_gist

---
 contrib/btree_gist/Makefile                 |  2 +-
 contrib/btree_gist/btree_gist--1.8--1.9.sql | 82 +++++++++++++++++++++
 contrib/btree_gist/btree_gist.control       |  2 +-
 3 files changed, 84 insertions(+), 2 deletions(-)
 create mode 100644 contrib/btree_gist/btree_gist--1.8--1.9.sql

diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index 7ac2df26c10..68190ac5e46 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -34,7 +34,7 @@ DATA = btree_gist--1.0--1.1.sql \
        btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \
        btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
        btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
-       btree_gist--1.7--1.8.sql
+       btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql
 PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
 
 REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
diff --git a/contrib/btree_gist/btree_gist--1.8--1.9.sql b/contrib/btree_gist/btree_gist--1.8--1.9.sql
new file mode 100644
index 00000000000..3fd6f33f41b
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.8--1.9.sql
@@ -0,0 +1,82 @@
+/* contrib/btree_gist/btree_gist--1.8--1.9.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.9'" to load this file. \quit
+
+ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
+	FUNCTION 11 (oid, oid) btoidsortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
+	FUNCTION 11 (int2, int2) btint2sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
+	FUNCTION 11 (int4, int4) btint4sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
+	FUNCTION 11 (int8, int8) btint8sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
+	FUNCTION 11 (float4, float4) btfloat4sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
+	FUNCTION 11 (float8, float8) btfloat8sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
+	FUNCTION 11 (timestamp, timestamp) timestamp_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
+	FUNCTION 11 (timestamptz, timestamptz) timestamp_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
+-- 	FUNCTION 12 (time, time) gist_stratnum_btree (int2) ;
+
+ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
+	FUNCTION 11 (date, date) date_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD
+-- 	FUNCTION 12 (interval, interval) gist_stratnum_btree (int2) ;
+
+-- ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD
+-- 	FUNCTION 12 (money, money) gist_stratnum_btree (int2) ;
+
+ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD
+	FUNCTION 11 (macaddr, macaddr) macaddr_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_text_ops USING gist ADD
+	FUNCTION 11 (text, text) bttextsortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD
+	FUNCTION 11 (bpchar, bpchar) bpchar_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD
+	FUNCTION 11 (bytea, bytea) bytea_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD
+	FUNCTION 11 (numeric, numeric) numeric_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD
+--	FUNCTION 12 (bit, bit) gist_stratnum_btree (int2) ;
+
+-- ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
+--	FUNCTION 12 (varbit, varbit) gist_stratnum_btree (int2) ;
+
+ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD
+	FUNCTION 11 (inet, inet) network_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD
+	FUNCTION 11 (cidr, cidr) network_sortsupport (internal) ;
+
+--ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD
+--	FUNCTION 12 (timetz, timetz) gist_stratnum_btree (int2) ;
+
+ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
+	FUNCTION 11 (uuid, uuid) uuid_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD
+-- 	FUNCTION 12 (macaddr8, macaddr8) gist_stratnum_btree (int2) ;
+
+-- ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
+-- 	FUNCTION 12 (anyenum, anyenum) gist_stratnum_btree (int2) ;
+
+-- ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD
+-- 	FUNCTION 12 (bool, bool) gist_stratnum_btree (int2) ;
diff --git a/contrib/btree_gist/btree_gist.control b/contrib/btree_gist/btree_gist.control
index abf66538f32..69d9341a0ad 100644
--- a/contrib/btree_gist/btree_gist.control
+++ b/contrib/btree_gist/btree_gist.control
@@ -1,6 +1,6 @@
 # btree_gist extension
 comment = 'support for indexing common datatypes in GiST'
-default_version = '1.8'
+default_version = '1.9'
 module_pathname = '$libdir/btree_gist'
 relocatable = true
 trusted = true
-- 
2.44.0

Reply via email to