On Fri, Mar 5, 2021 at 6:00 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangot...@gmail.com> writes:
> > Updated patch attached.
>
> This claim seems false on its face:
>
> > All child constraints of a given foreign key constraint can use the
> > same RI query and the resulting plan, that is, no need to create as
> > many copies of the query and the plan as there are partitions, as
> > happens now due to the child constraint OID being used in the key
> > for ri_query_cache.
>
> What if the child tables don't have the same physical column numbers
> as the parent?  The comment claiming that it's okay if riinfo->fk_attnums
> doesn't match seems quite off point, because the query plan is still
> going to need to use the correct column numbers.  Even if column numbers
> are the same, the plan would also contain table and index OIDs that could
> only be right for one partition.
>
> I could imagine translating a parent plan to apply to a child instead of
> building it from scratch, but that would take a lot of code we don't have
> (there's no rewriteHandler infrastructure for plan nodes).
>
> Maybe I'm missing something, because I see that the cfbot claims
> this is passing, and I'd sure like to think that our test coverage
> is not so thin that it'd fail to detect probing the wrong partition
> for foreign key matches.  But that's what it looks like this patch
> will do.
The quoted comment could have been written to be clearer about this, but
it's not talking about the table that is to be queried, but the table
whose RI trigger is being executed.  In all the cases except one
(mentioned below), the table that is queried is the same irrespective of
which partition's trigger is being executed, so we're basically creating
the same plan separately for each partition.

create table foo (a int primary key) partition by range (a);
create table foo1 partition of foo for values from (minvalue) to (1);
create table foo2 partition of foo for values from (1) to (maxvalue);
create table bar (a int references foo) partition by range (a);
create table bar1 partition of bar for values from (minvalue) to (1);
create table bar2 partition of bar for values from (1) to (maxvalue);
insert into foo values (0), (1);
insert into bar values (0), (1);

For the 2nd insert statement, RI_FKey_check() issues the following query
irrespective of whether it is called from bar1's or bar2's insert check
RI trigger:

SELECT 1 FROM foo WHERE foo.a = $1;

Likewise, for:

delete from foo;

ri_restrict() issues the following query irrespective of whether it is
called from foo1's or foo2's delete action trigger:

SELECT 1 FROM bar WHERE bar.a = $1;

The one case I found in which the queried table is the same as the table
whose trigger has been invoked is ri_Check_Pk_Match(), in which case it's
actually wrong for partitions to share the plan, so the patch was wrong
in that case.
Apparently, we didn't test that case with partitioning, so I added one
as follows:

+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+ CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+ CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+ CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+ CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2

With the v3 patch, the test case fails with the following error on COMMIT:

+ERROR:  update or delete on table "tbl1_p2" violates foreign key constraint "tbl2_f1_fkey2" on table "tbl2"
+DETAIL:  Key (f1)=(1) is still referenced from table "tbl2".

That's because in ri_Check_Pk_Match(), partition tbl1_p2 ends up using
the same cached plan as tbl1_p1 and so scans tbl1_p1 to check whether
the row deleted from tbl1_p2 has reappeared, which is of course wrong.
I have fixed that by continuing to use the individual partition's
constraint's OID as the query key for this particular query.

I have also removed the somewhat confusing comment about fk_attnums.
The point it was trying to make is that fk_attnums being possibly
different among partitions does not make a difference to what any given
shareable query's text ultimately looks like.  Those attribute numbers
are only used by the macros RIAttName(), RIAttType(), RIAttCollation()
when generating the query text and they'd all return the same values no
matter which partition's attribute numbers are used.

Updated patch attached.
--
Amit Langote
EDB: http://www.enterprisedb.com
From b433aabbab5f74a4284fabdb078200c6c1e89e57 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH v4] Share SPI plan between partitions in some RI triggers

In all cases except one, the RI triggers of leaf partitions that all
share a given foreign key constraint issue the same query through SPI,
although ri_query_cache stores the plan of the query separately for each
partition.  That is unnecessary and can even lead to a huge increase in
memory allocated to store those plans if the query's target relation is
a partitioned table containing many partitions.

This commit changes the key used by ri_query_cache in such cases to be
the shared foreign key constraint's root OID so that all partitions
sharing the constraint also share the plan.

However, the query issued by ri_Check_Pk_Match() targets the same table
as the one on which the RI trigger is fired, so each leaf partition
should get its own query and the associated plan in that one case.  This
commit adds a test for this case.

Keisuke Kuroda, Amit Langote
---
 src/backend/utils/adt/ri_triggers.c       | 78 +++++++++++++++++++++--
 src/test/regress/expected/foreign_key.out | 18 ++++++
 src/test/regress/sql/foreign_key.sql      | 16 +++++
 3 files changed, 106 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 6e3a41062f..db65d61c49 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,13 +95,20 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache with constraint_root_id as a part of the hash key.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_parent;	/* OID of parent of this constraint */
+	Oid			constraint_root_id; /* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint is
+									 * inherited; same as constraint_id if the
+									 * constraint is non-inherited */
 	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	uint32		oidHashValue;	/* hash value of constraint_id */
+	uint32		rootHashValue;	/* hash value of constraint_root_id */
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
@@ -221,6 +228,8 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid	get_ri_constraint_root(const RI_ConstraintInfo *riinfo);
+static Oid	get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1901,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  * Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *	key: output argument, *key is filled in based on the other arguments
- *	riinfo: info from pg_constraint entry
+ *	riinfo: info derived from pg_constraint entry
  *	constr_queryno: an internal number identifying the query type
  *		(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1911,23 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * For partitions, constraint_root_id stores the OID of the constraint in
+	 * the ancestor from which this constraint has been inherited.  For most
+	 * queries, we use that OID as the key instead of the partition's own
+	 * constraint's OID so that all partitions sharing the constraint will
+	 * also share the SPI plan.  That's okay to do because the table that is
+	 * queried is the same irrespective of which partition's trigger is being
+	 * executed.  The only case in which the table queried is the same as the
+	 * table whose trigger is being executed is when performing
+	 * ri_Check_Pk_Match(), where each partition must have its own plan.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	if (constr_queryno != RI_PLAN_CHECK_LOOKUPPK_FROM_PK)
+		key->constr_id = riinfo->constraint_root_id;
+	else
+		key->constr_id = riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2009,6 +2031,37 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	return riinfo;
 }
 
+/*
+ * get_ri_constraint_root
+ *		Returns the OID of RI constraint's root parent or its own if not
+ *		inherited
+ */
+static Oid
+get_ri_constraint_root(const RI_ConstraintInfo *riinfo)
+{
+	if (!OidIsValid(riinfo->constraint_parent))
+		return riinfo->constraint_id;
+
+	return get_ri_constraint_root_recurse(riinfo->constraint_parent);
+}
+
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
 /*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
@@ -2051,8 +2104,12 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->constraint_parent = conForm->conparentid;
+	riinfo->constraint_root_id = get_ri_constraint_root(riinfo);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
+	riinfo->rootHashValue = GetSysCacheHashValue1(CONSTROID,
+												  ObjectIdGetDatum(riinfo->constraint_root_id));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
 	riinfo->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
@@ -2117,7 +2174,16 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 		RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
 													valid_link, iter.cur);
 
-		if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
+		/*
+		 * We look for changes to two specific pg_constraint entries here --
+		 * the one matching the constraint given by riinfo->constraint_id and
+		 * also the one given by riinfo->constraint_root_id.  The latter too
+		 * because if its parent is updated, it is no longer the root
+		 * constraint.
+		 */
+		if (hashvalue == 0 ||
+			riinfo->oidHashValue == hashvalue ||
+			riinfo->rootHashValue == hashvalue)
 		{
 			riinfo->valid = false;
 			/* Remove invalidated entries from the list, too */
diff --git a/src/test/regress/expected/foreign_key.out b/src/test/regress/expected/foreign_key.out
index 07bd5b6434..7386f4d635 100644
--- a/src/test/regress/expected/foreign_key.out
+++ b/src/test/regress/expected/foreign_key.out
@@ -2470,3 +2470,21 @@ DROP SCHEMA fkpart9 CASCADE;
 NOTICE:  drop cascades to 2 other objects
 DETAIL:  drop cascades to table fkpart9.pk
 drop cascades to table fkpart9.fk
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+ CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+ CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+ CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+ CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2
diff --git a/src/test/regress/sql/foreign_key.sql b/src/test/regress/sql/foreign_key.sql
index c5c9011afc..67aa20435d 100644
--- a/src/test/regress/sql/foreign_key.sql
+++ b/src/test/regress/sql/foreign_key.sql
@@ -1738,3 +1738,19 @@ DELETE FROM fkpart9.pk WHERE a=35;
 SELECT * FROM fkpart9.pk;
 SELECT * FROM fkpart9.fk;
 DROP SCHEMA fkpart9 CASCADE;
+
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+ CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+ CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+ CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+ CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
-- 
2.24.1