In https://www.postgresql.org/message-id/15293.1466536...@sss.pgh.pa.us I speculated that it might not take too much to replace all the variants of GIN array_ops with a single polymorphic opclass over anyarray. Attached is a proposed patch that does that.
There are two bits of added functionality needed to make this work: 1. We need to abstract the storage type. The patch does this by teaching catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an opcintype of ANYARRAY, and doing the array element type lookup at index creation time. 2. We need to abstract the key comparator. The patch does this by teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC, it should look up the default btree comparator for the index key type. Both of these seem to me to be reasonable general-purpose behaviors with potential application to other opclasses. In the aforementioned message I worried that a core opclass defined this way might conflict with user-built opclasses for specific array types, but it seems to work out fine without any additional tweaks: CREATE INDEX already prefers an exact match if it finds one, and only falls back to matching anyarray when it doesn't. Also, all the replaced opclasses are presently default for their types, which means that pg_dump won't print them explicitly in CREATE INDEX commands, so we don't have a dump/reload or pg_upgrade hazard from them disappearing. A potential downside is that for an opclass defined this way, we add a lookup_type_cache() call to each initGinState() call. That's basically just a single dynahash lookup once the caches are populated, so it's not much added cost, but conceivably it could be measurable in bulk insert operations. If it does prove objectionable my inclination would be to look into ways to avoid the repetitive function lookups of initGinState, perhaps by letting it cache that stuff in the index's relcache entry. I'll put this on the September commitfest docket. regards, tom lane
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml index 05d92eb..d0d2ba8 100644 *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *************** *** 441,462 **** </para> <para> ! There are three methods that an operator class for <acronym>GIN</acronym> must provide: ! <variablelist> ! <varlistentry> ! <term><function>int compare(Datum a, Datum b)</></term> ! <listitem> ! <para> ! Compares two keys (not indexed items!) and returns an integer less than ! zero, zero, or greater than zero, indicating whether the first key is ! less than, equal to, or greater than the second. Null keys are never ! passed to this function. ! </para> ! </listitem> ! </varlistentry> ! <varlistentry> <term><function>Datum *extractValue(Datum itemValue, int32 *nkeys, bool **nullFlags)</></term> --- 441,450 ---- </para> <para> ! There are two methods that an operator class for <acronym>GIN</acronym> must provide: ! <variablelist> <varlistentry> <term><function>Datum *extractValue(Datum itemValue, int32 *nkeys, bool **nullFlags)</></term> *************** *** 645,651 **** --- 633,670 ---- </listitem> </varlistentry> </variablelist> + </para> + + <para> + In addition, GIN must have a way to sort the key values stored in the index. + The operator class can define the sort ordering by specifying a comparison + method: + <variablelist> + <varlistentry> + <term><function>int compare(Datum a, Datum b)</></term> + <listitem> + <para> + Compares two keys (not indexed items!) and returns an integer less than + zero, zero, or greater than zero, indicating whether the first key is + less than, equal to, or greater than the second. Null keys are never + passed to this function. + </para> + </listitem> + </varlistentry> + </variablelist> + + Alternatively, if the operator class does not provide a <function>compare</> + method, GIN will look up the default btree operator class for the index + key data type, and use its comparison function. It is recommended to + specify the comparison function in a GIN operator class that is meant for + just one data type, as looking up the btree operator class costs a few + cycles. However, polymorphic GIN operator classes (such + as <literal>array_ops</>) typically cannot specify a single comparison + function. + </para> + + <para> Optionally, an operator class for <acronym>GIN</acronym> can supply the following method: diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index a2450f4..8bfa924 100644 *** a/src/backend/access/gin/ginutil.c --- b/src/backend/access/gin/ginutil.c *************** *** 22,28 **** --- 22,30 ---- #include "miscadmin.h" #include "storage/indexfsm.h" #include "storage/lmgr.h" + #include "utils/builtins.h" #include "utils/index_selfuncs.h" + #include "utils/typcache.h" /* *************** initGinState(GinState *state, Relation i *** 104,112 **** origTupdesc->attrs[i]->attcollation); } ! fmgr_info_copy(&(state->compareFn[i]), ! index_getprocinfo(index, i + 1, GIN_COMPARE_PROC), ! CurrentMemoryContext); fmgr_info_copy(&(state->extractValueFn[i]), index_getprocinfo(index, i + 1, GIN_EXTRACTVALUE_PROC), CurrentMemoryContext); --- 106,138 ---- origTupdesc->attrs[i]->attcollation); } ! /* ! * If the compare proc isn't specified in the opclass definition, look ! * up the index key type's default btree comparator. ! */ ! if (index_getprocid(index, i + 1, GIN_COMPARE_PROC) != InvalidOid) ! { ! fmgr_info_copy(&(state->compareFn[i]), ! index_getprocinfo(index, i + 1, GIN_COMPARE_PROC), ! CurrentMemoryContext); ! } ! else ! { ! TypeCacheEntry *typentry; ! ! typentry = lookup_type_cache(origTupdesc->attrs[i]->atttypid, ! TYPECACHE_CMP_PROC_FINFO); ! if (!OidIsValid(typentry->cmp_proc_finfo.fn_oid)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_FUNCTION), ! errmsg("could not identify a comparison function for type %s", ! format_type_be(origTupdesc->attrs[i]->atttypid)))); ! fmgr_info_copy(&(state->compareFn[i]), ! &(typentry->cmp_proc_finfo), ! CurrentMemoryContext); ! } ! ! /* Opclass must always provide extract procs */ fmgr_info_copy(&(state->extractValueFn[i]), index_getprocinfo(index, i + 1, GIN_EXTRACTVALUE_PROC), CurrentMemoryContext); diff --git a/src/backend/access/gin/ginvalidate.c b/src/backend/access/gin/ginvalidate.c index 7518ede..425484e 100644 *** a/src/backend/access/gin/ginvalidate.c --- b/src/backend/access/gin/ginvalidate.c *************** ginvalidate(Oid opclassoid) *** 237,243 **** if (opclassgroup && (opclassgroup->functionset & (((uint64) 1) << i)) != 0) continue; /* got it */ ! if (i == GIN_COMPARE_PARTIAL_PROC) continue; /* optional method */ if (i == GIN_CONSISTENT_PROC || i == GIN_TRICONSISTENT_PROC) continue; /* don't need both, see check below loop */ --- 237,243 ---- if (opclassgroup && (opclassgroup->functionset & (((uint64) 1) << i)) != 0) continue; /* got it */ ! if (i == GIN_COMPARE_PROC || i == GIN_COMPARE_PARTIAL_PROC) continue; /* optional method */ if (i == GIN_CONSISTENT_PROC || i == GIN_TRICONSISTENT_PROC) continue; /* don't need both, see check below loop */ diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 7b30e46..77ba6e5 100644 *** a/src/backend/catalog/index.c --- b/src/backend/catalog/index.c *************** ConstructTupleDescriptor(Relation heapRe *** 437,447 **** keyType = opclassTup->opckeytype; else keyType = amroutine->amkeytype; ReleaseSysCache(tuple); if (OidIsValid(keyType) && keyType != to->atttypid) { - /* index value and heap value have different types */ tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(keyType)); if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for type %u", keyType); --- 437,464 ---- keyType = opclassTup->opckeytype; else keyType = amroutine->amkeytype; + + /* + * If keytype is specified as ANYELEMENT, and opcintype is ANYARRAY, + * then the attribute type must be an array (else it'd not have + * matched this opclass); use its element type. + */ + if (keyType == ANYELEMENTOID && opclassTup->opcintype == ANYARRAYOID) + { + keyType = get_base_element_type(to->atttypid); + if (!OidIsValid(keyType)) + elog(ERROR, "could not get element type of array type %u", + to->atttypid); + } + ReleaseSysCache(tuple); + /* + * If a key type different from the heap value is specified, update + * the type-related fields in the index tupdesc. + */ if (OidIsValid(keyType) && keyType != to->atttypid) { tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(keyType)); if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for type %u", keyType); diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h index 00320b4..6dbbf36 100644 *** a/src/include/catalog/pg_amproc.h --- b/src/include/catalog/pg_amproc.h *************** DATA(insert ( 3550 869 869 9 3573 )); *** 255,410 **** /* gin */ ! DATA(insert ( 2745 1007 1007 1 351 )); ! DATA(insert ( 2745 1007 1007 2 2743 )); ! DATA(insert ( 2745 1007 1007 3 2774 )); ! DATA(insert ( 2745 1007 1007 4 2744 )); ! DATA(insert ( 2745 1007 1007 6 3920 )); ! DATA(insert ( 2745 1009 1009 1 360 )); ! DATA(insert ( 2745 1009 1009 2 2743 )); ! DATA(insert ( 2745 1009 1009 3 2774 )); ! DATA(insert ( 2745 1009 1009 4 2744 )); ! DATA(insert ( 2745 1009 1009 6 3920 )); ! DATA(insert ( 2745 1015 1015 1 360 )); ! DATA(insert ( 2745 1015 1015 2 2743 )); ! DATA(insert ( 2745 1015 1015 3 2774 )); ! DATA(insert ( 2745 1015 1015 4 2744 )); ! DATA(insert ( 2745 1015 1015 6 3920 )); ! DATA(insert ( 2745 1023 1023 1 357 )); ! DATA(insert ( 2745 1023 1023 2 2743 )); ! DATA(insert ( 2745 1023 1023 3 2774 )); ! DATA(insert ( 2745 1023 1023 4 2744 )); ! DATA(insert ( 2745 1023 1023 6 3920 )); ! DATA(insert ( 2745 1561 1561 1 1596 )); ! DATA(insert ( 2745 1561 1561 2 2743 )); ! DATA(insert ( 2745 1561 1561 3 2774 )); ! DATA(insert ( 2745 1561 1561 4 2744 )); ! DATA(insert ( 2745 1561 1561 6 3920 )); ! DATA(insert ( 2745 1000 1000 1 1693 )); ! DATA(insert ( 2745 1000 1000 2 2743 )); ! DATA(insert ( 2745 1000 1000 3 2774 )); ! DATA(insert ( 2745 1000 1000 4 2744 )); ! DATA(insert ( 2745 1000 1000 6 3920 )); ! DATA(insert ( 2745 1014 1014 1 1078 )); ! DATA(insert ( 2745 1014 1014 2 2743 )); ! DATA(insert ( 2745 1014 1014 3 2774 )); ! DATA(insert ( 2745 1014 1014 4 2744 )); ! DATA(insert ( 2745 1014 1014 6 3920 )); ! DATA(insert ( 2745 1001 1001 1 1954 )); ! DATA(insert ( 2745 1001 1001 2 2743 )); ! DATA(insert ( 2745 1001 1001 3 2774 )); ! DATA(insert ( 2745 1001 1001 4 2744 )); ! DATA(insert ( 2745 1001 1001 6 3920 )); ! DATA(insert ( 2745 1002 1002 1 358 )); ! DATA(insert ( 2745 1002 1002 2 2743 )); ! DATA(insert ( 2745 1002 1002 3 2774 )); ! DATA(insert ( 2745 1002 1002 4 2744 )); ! DATA(insert ( 2745 1002 1002 6 3920 )); ! DATA(insert ( 2745 1182 1182 1 1092 )); ! DATA(insert ( 2745 1182 1182 2 2743 )); ! DATA(insert ( 2745 1182 1182 3 2774 )); ! DATA(insert ( 2745 1182 1182 4 2744 )); ! DATA(insert ( 2745 1182 1182 6 3920 )); ! DATA(insert ( 2745 1021 1021 1 354 )); ! DATA(insert ( 2745 1021 1021 2 2743 )); ! DATA(insert ( 2745 1021 1021 3 2774 )); ! DATA(insert ( 2745 1021 1021 4 2744 )); ! DATA(insert ( 2745 1021 1021 6 3920 )); ! DATA(insert ( 2745 1022 1022 1 355 )); ! DATA(insert ( 2745 1022 1022 2 2743 )); ! DATA(insert ( 2745 1022 1022 3 2774 )); ! DATA(insert ( 2745 1022 1022 4 2744 )); ! DATA(insert ( 2745 1022 1022 6 3920 )); ! DATA(insert ( 2745 1041 1041 1 926 )); ! DATA(insert ( 2745 1041 1041 2 2743 )); ! DATA(insert ( 2745 1041 1041 3 2774 )); ! DATA(insert ( 2745 1041 1041 4 2744 )); ! DATA(insert ( 2745 1041 1041 6 3920 )); ! DATA(insert ( 2745 651 651 1 926 )); ! DATA(insert ( 2745 651 651 2 2743 )); ! DATA(insert ( 2745 651 651 3 2774 )); ! DATA(insert ( 2745 651 651 4 2744 )); ! DATA(insert ( 2745 651 651 6 3920 )); ! DATA(insert ( 2745 1005 1005 1 350 )); ! DATA(insert ( 2745 1005 1005 2 2743 )); ! DATA(insert ( 2745 1005 1005 3 2774 )); ! DATA(insert ( 2745 1005 1005 4 2744 )); ! DATA(insert ( 2745 1005 1005 6 3920 )); ! DATA(insert ( 2745 1016 1016 1 842 )); ! DATA(insert ( 2745 1016 1016 2 2743 )); ! DATA(insert ( 2745 1016 1016 3 2774 )); ! DATA(insert ( 2745 1016 1016 4 2744 )); ! DATA(insert ( 2745 1016 1016 6 3920 )); ! DATA(insert ( 2745 1187 1187 1 1315 )); ! DATA(insert ( 2745 1187 1187 2 2743 )); ! DATA(insert ( 2745 1187 1187 3 2774 )); ! DATA(insert ( 2745 1187 1187 4 2744 )); ! DATA(insert ( 2745 1187 1187 6 3920 )); ! DATA(insert ( 2745 1040 1040 1 836 )); ! DATA(insert ( 2745 1040 1040 2 2743 )); ! DATA(insert ( 2745 1040 1040 3 2774 )); ! DATA(insert ( 2745 1040 1040 4 2744 )); ! DATA(insert ( 2745 1040 1040 6 3920 )); ! DATA(insert ( 2745 1003 1003 1 359 )); ! DATA(insert ( 2745 1003 1003 2 2743 )); ! DATA(insert ( 2745 1003 1003 3 2774 )); ! DATA(insert ( 2745 1003 1003 4 2744 )); ! DATA(insert ( 2745 1003 1003 6 3920 )); ! DATA(insert ( 2745 1231 1231 1 1769 )); ! DATA(insert ( 2745 1231 1231 2 2743 )); ! DATA(insert ( 2745 1231 1231 3 2774 )); ! DATA(insert ( 2745 1231 1231 4 2744 )); ! DATA(insert ( 2745 1231 1231 6 3920 )); ! DATA(insert ( 2745 1028 1028 1 356 )); ! DATA(insert ( 2745 1028 1028 2 2743 )); ! DATA(insert ( 2745 1028 1028 3 2774 )); ! DATA(insert ( 2745 1028 1028 4 2744 )); ! DATA(insert ( 2745 1028 1028 6 3920 )); ! DATA(insert ( 2745 1013 1013 1 404 )); ! DATA(insert ( 2745 1013 1013 2 2743 )); ! DATA(insert ( 2745 1013 1013 3 2774 )); ! DATA(insert ( 2745 1013 1013 4 2744 )); ! DATA(insert ( 2745 1013 1013 6 3920 )); ! DATA(insert ( 2745 1183 1183 1 1107 )); ! DATA(insert ( 2745 1183 1183 2 2743 )); ! DATA(insert ( 2745 1183 1183 3 2774 )); ! DATA(insert ( 2745 1183 1183 4 2744 )); ! DATA(insert ( 2745 1183 1183 6 3920 )); ! DATA(insert ( 2745 1185 1185 1 1314 )); ! DATA(insert ( 2745 1185 1185 2 2743 )); ! DATA(insert ( 2745 1185 1185 3 2774 )); ! DATA(insert ( 2745 1185 1185 4 2744 )); ! DATA(insert ( 2745 1185 1185 6 3920 )); ! DATA(insert ( 2745 1270 1270 1 1358 )); ! DATA(insert ( 2745 1270 1270 2 2743 )); ! DATA(insert ( 2745 1270 1270 3 2774 )); ! DATA(insert ( 2745 1270 1270 4 2744 )); ! DATA(insert ( 2745 1270 1270 6 3920 )); ! DATA(insert ( 2745 1563 1563 1 1672 )); ! DATA(insert ( 2745 1563 1563 2 2743 )); ! DATA(insert ( 2745 1563 1563 3 2774 )); ! DATA(insert ( 2745 1563 1563 4 2744 )); ! DATA(insert ( 2745 1563 1563 6 3920 )); ! DATA(insert ( 2745 1115 1115 1 2045 )); ! DATA(insert ( 2745 1115 1115 2 2743 )); ! DATA(insert ( 2745 1115 1115 3 2774 )); ! DATA(insert ( 2745 1115 1115 4 2744 )); ! DATA(insert ( 2745 1115 1115 6 3920 )); ! DATA(insert ( 2745 791 791 1 377 )); ! DATA(insert ( 2745 791 791 2 2743 )); ! DATA(insert ( 2745 791 791 3 2774 )); ! DATA(insert ( 2745 791 791 4 2744 )); ! DATA(insert ( 2745 791 791 6 3920 )); ! DATA(insert ( 2745 1024 1024 1 380 )); ! DATA(insert ( 2745 1024 1024 2 2743 )); ! DATA(insert ( 2745 1024 1024 3 2774 )); ! DATA(insert ( 2745 1024 1024 4 2744 )); ! DATA(insert ( 2745 1024 1024 6 3920 )); ! DATA(insert ( 2745 1025 1025 1 381 )); ! DATA(insert ( 2745 1025 1025 2 2743 )); ! DATA(insert ( 2745 1025 1025 3 2774 )); ! DATA(insert ( 2745 1025 1025 4 2744 )); ! DATA(insert ( 2745 1025 1025 6 3920 )); DATA(insert ( 3659 3614 3614 1 3724 )); DATA(insert ( 3659 3614 3614 2 3656 )); DATA(insert ( 3659 3614 3614 3 3657 )); --- 255,264 ---- /* gin */ ! DATA(insert ( 2745 2277 2277 2 2743 )); ! DATA(insert ( 2745 2277 2277 3 2774 )); ! DATA(insert ( 2745 2277 2277 4 2744 )); ! DATA(insert ( 2745 2277 2277 6 3920 )); DATA(insert ( 3659 3614 3614 1 3724 )); DATA(insert ( 3659 3614 3614 2 3656 )); DATA(insert ( 3659 3614 3614 3 3657 )); diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h index 6c82d94..240219d 100644 *** a/src/include/catalog/pg_opclass.h --- b/src/include/catalog/pg_opclass.h *************** DATA(insert ( 783 box_ops PGNSP PGUI *** 183,218 **** DATA(insert ( 783 point_ops PGNSP PGUID 1029 600 t 603 )); DATA(insert ( 783 poly_ops PGNSP PGUID 2594 604 t 603 )); DATA(insert ( 783 circle_ops PGNSP PGUID 2595 718 t 603 )); ! DATA(insert ( 2742 _int4_ops PGNSP PGUID 2745 1007 t 23 )); ! DATA(insert ( 2742 _text_ops PGNSP PGUID 2745 1009 t 25 )); ! DATA(insert ( 2742 _abstime_ops PGNSP PGUID 2745 1023 t 702 )); ! DATA(insert ( 2742 _bit_ops PGNSP PGUID 2745 1561 t 1560 )); ! DATA(insert ( 2742 _bool_ops PGNSP PGUID 2745 1000 t 16 )); ! DATA(insert ( 2742 _bpchar_ops PGNSP PGUID 2745 1014 t 1042 )); ! DATA(insert ( 2742 _bytea_ops PGNSP PGUID 2745 1001 t 17 )); ! DATA(insert ( 2742 _char_ops PGNSP PGUID 2745 1002 t 18 )); ! DATA(insert ( 2742 _cidr_ops PGNSP PGUID 2745 651 t 650 )); ! DATA(insert ( 2742 _date_ops PGNSP PGUID 2745 1182 t 1082 )); ! DATA(insert ( 2742 _float4_ops PGNSP PGUID 2745 1021 t 700 )); ! DATA(insert ( 2742 _float8_ops PGNSP PGUID 2745 1022 t 701 )); ! DATA(insert ( 2742 _inet_ops PGNSP PGUID 2745 1041 t 869 )); ! DATA(insert ( 2742 _int2_ops PGNSP PGUID 2745 1005 t 21 )); ! DATA(insert ( 2742 _int8_ops PGNSP PGUID 2745 1016 t 20 )); ! DATA(insert ( 2742 _interval_ops PGNSP PGUID 2745 1187 t 1186 )); ! DATA(insert ( 2742 _macaddr_ops PGNSP PGUID 2745 1040 t 829 )); ! DATA(insert ( 2742 _name_ops PGNSP PGUID 2745 1003 t 19 )); ! DATA(insert ( 2742 _numeric_ops PGNSP PGUID 2745 1231 t 1700 )); ! DATA(insert ( 2742 _oid_ops PGNSP PGUID 2745 1028 t 26 )); ! DATA(insert ( 2742 _oidvector_ops PGNSP PGUID 2745 1013 t 30 )); ! DATA(insert ( 2742 _time_ops PGNSP PGUID 2745 1183 t 1083 )); ! DATA(insert ( 2742 _timestamptz_ops PGNSP PGUID 2745 1185 t 1184 )); ! DATA(insert ( 2742 _timetz_ops PGNSP PGUID 2745 1270 t 1266 )); ! DATA(insert ( 2742 _varbit_ops PGNSP PGUID 2745 1563 t 1562 )); ! DATA(insert ( 2742 _varchar_ops PGNSP PGUID 2745 1015 t 1043 )); ! DATA(insert ( 2742 _timestamp_ops PGNSP PGUID 2745 1115 t 1114 )); ! DATA(insert ( 2742 _money_ops PGNSP PGUID 2745 791 t 790 )); ! DATA(insert ( 2742 _reltime_ops PGNSP PGUID 2745 1024 t 703 )); ! DATA(insert ( 2742 _tinterval_ops PGNSP PGUID 2745 1025 t 704 )); DATA(insert ( 403 uuid_ops PGNSP PGUID 2968 2950 t 0 )); DATA(insert ( 405 uuid_ops PGNSP PGUID 2969 2950 t 0 )); DATA(insert ( 403 pg_lsn_ops PGNSP PGUID 3253 3220 t 0 )); --- 183,189 ---- DATA(insert ( 783 point_ops PGNSP PGUID 1029 600 t 603 )); DATA(insert ( 783 poly_ops PGNSP PGUID 2594 604 t 603 )); DATA(insert ( 783 circle_ops PGNSP PGUID 2595 718 t 603 )); ! DATA(insert ( 2742 array_ops PGNSP PGUID 2745 2277 t 2283 )); DATA(insert ( 403 uuid_ops PGNSP PGUID 2968 2950 t 0 )); DATA(insert ( 405 uuid_ops PGNSP PGUID 2969 2950 t 0 )); DATA(insert ( 403 pg_lsn_ops PGNSP PGUID 3253 3220 t 0 ));
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers