On Sat, Apr 13, 2024 at 9:13 PM jian he <jian.universal...@gmail.com> wrote:
>
> > > > Here is a patch implementing this. It was a bit more fuss than I 
> > > > expected, so maybe someone has a
> > > > better way.
> I think I found a simple way.
>
> the logic is:
> * ATExecAlterColumnType changes one column once at a time.
> * one column can only have one collation. so we don't need  to store a
> list of collation oid.
> * ATExecAlterColumnType we can get the new collation (targetcollid)
> and original collation info.
> * RememberAllDependentForRebuilding  will check the column dependent,
> whether this column is referenced by a foreign key or not information
> is recorded.
> so AlteredTableInfo->changedConstraintOids have the primary key and
> foreign key oids.
> * ATRewriteCatalogs will call ATPostAlterTypeCleanup (see the comments
> in ATRewriteCatalogs)
> * for tab->changedConstraintOids (foreign key, primary key) will call
> ATPostAlterTypeParse, so
> for foreign key (con->contype == CONSTR_FOREIGN)  will call 
> TryReuseForeignKey.
> * in TryReuseForeignKey, we can pass the information that our primary
> key old collation is nondeterministic
> and old collation != new collation to the foreign key constraint.
> so foreign key can act accordingly at ATAddForeignKeyConstraint 
> (old_check_ok).
>
>
> based on the above logic, I add one bool in struct AlteredTableInfo,
> one bool in struct Constraint.
> bool in AlteredTableInfo is for storing it, later passing it to struct
> Constraint.
> we need bool in Constraint because ATAddForeignKeyConstraint needs it.

I refactored the comments.
also added some extra tests hoping to make it more bullet proof, maybe
it's redundant.
From 9a2e73231e037c7555449d8790885c1feb4f3ed0 Mon Sep 17 00:00:00 2001
From: jian he <jian.universal...@gmail.com>
Date: Fri, 7 Jun 2024 14:32:55 +0800
Subject: [PATCH v4 1/1] Revalidate PK-FK ties when PK collation is changed

With deterministic collations, we check the PK-FK ties by looking at binary
equality. But nondeterministic collation can allow non-identical values
to be considered equal, e.g. 'a' and 'A' when case-insensitive.
ALTER COLUMN .. SET DATA TYPE make
some references that used to be valid may not be anymore.

for `ALTER COLUMN .. SET DATA TYPE`:
If the changed key column is part of the primary key columns,
then under the following conditions we need to revalidate the PK-FK ties
* the PK new collation is different from the old
* the old collation is nondeterministic.

we need to revalidate the PK-FK ties.
discussion: https://postgr.es/m/78d824e0-b21e-480d-a252-e4b84bc2c24b%40illuminatedcomputing.com
---
 src/backend/commands/tablecmds.c              | 61 +++++++++++++++----
 src/include/nodes/parsenodes.h                |  3 +-
 .../regress/expected/collate.icu.utf8.out     | 18 ++++++
 src/test/regress/sql/collate.icu.utf8.sql     | 16 +++++
 4 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7b6c69b7..77337b6f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -199,6 +199,7 @@ typedef struct AlteredTableInfo
 	/* Objects to rebuild after completing ALTER TYPE operations */
 	List	   *changedConstraintOids;	/* OIDs of constraints to rebuild */
 	List	   *changedConstraintDefs;	/* string definitions of same */
+	bool		verify_new_collation; /* T if we need recheck the new collation */
 	List	   *changedIndexOids;	/* OIDs of indexes to rebuild */
 	List	   *changedIndexDefs;	/* string definitions of same */
 	char	   *replicaIdentityIndex;	/* index to reset as REPLICA IDENTITY */
@@ -572,12 +573,13 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
 								   LOCKMODE lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
 								 char *cmd, List **wqueue, LOCKMODE lockmode,
-								 bool rewrite);
+								 bool rewrite,
+								 bool verify_new_collation);
 static void RebuildConstraintComment(AlteredTableInfo *tab, AlterTablePass pass,
 									 Oid objid, Relation rel, List *domname,
 									 const char *conname);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
-static void TryReuseForeignKey(Oid oldId, Constraint *con);
+static void TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
 													 List *options, LOCKMODE lockmode);
 static void change_owner_fix_column_acls(Oid relationOid,
@@ -9857,6 +9859,15 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 									old_pfeqop_item);
 		}
 		if (old_check_ok)
+		{
+			/*
+			 * collation_recheck is true then we need to
+			 * revalidate the foreign key and primary key value ties.
+			 * see also TryReuseForeignKey.
+			*/
+			old_check_ok = !fkconstraint->collation_recheck;
+		}
+		if (old_check_ok)
 		{
 			Oid			old_fktype;
 			Oid			new_fktype;
@@ -9902,10 +9913,6 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 			 * turn conform to the domain.  Consequently, we need not treat
 			 * domains specially here.
 			 *
-			 * Since we require that all collations share the same notion of
-			 * equality (which they do, because texteq reduces to bitwise
-			 * equality), we don't compare collation here.
-			 *
 			 * We need not directly consider the PK type.  It's necessarily
 			 * binary coercible to the opcintype of the unique index column,
 			 * and ri_triggers.c will only deal with PK datums in terms of
@@ -13035,6 +13042,29 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
 	/* And the collation */
 	targetcollid = GetColumnDefCollation(NULL, def, targettype);
 
+	if (OidIsValid(targetcollid))
+	{
+		Oid			typid;
+		int32		typmod;
+		Oid			source_collid;
+
+		get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum,
+							&typid, &typmod, &source_collid);
+		/*
+		* All deterministic collations use bitwise equality to resolve
+		* PK-FK ties. But if the primary key (source) collation was indeterministic,
+		* and ALTER COLUMN .. SET DATA TYPE changes the primary key collation, then
+		* we must re-check the PK-FK ties, because some references
+		* that used to be "equal" may not be anymore.
+		* We add a flag in AlteredTableInfo to signify
+		* that we need to verify the new collation for PK-FK ties.
+		* see ATAddForeignKeyConstraint also.
+		*/
+		if (OidIsValid(source_collid) &&
+				!get_collation_isdeterministic(source_collid) &&
+				(targetcollid != source_collid))
+			tab->verify_new_collation = true;
+	}
 	/*
 	 * If there is a default expression for the column, get it and ensure we
 	 * can coerce it to the new datatype.  (We must do this before changing
@@ -13742,7 +13772,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
 
 		ATPostAlterTypeParse(oldId, relid, confrelid,
 							 (char *) lfirst(def_item),
-							 wqueue, lockmode, tab->rewrite);
+							 wqueue, lockmode, tab->rewrite,
+							 tab->verify_new_collation);
 	}
 	forboth(oid_item, tab->changedIndexOids,
 			def_item, tab->changedIndexDefs)
@@ -13753,7 +13784,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
 		relid = IndexGetRelation(oldId, false);
 		ATPostAlterTypeParse(oldId, relid, InvalidOid,
 							 (char *) lfirst(def_item),
-							 wqueue, lockmode, tab->rewrite);
+							 wqueue, lockmode, tab->rewrite,
+							 tab->verify_new_collation);
 
 		ObjectAddressSet(obj, RelationRelationId, oldId);
 		add_exact_object_address(&obj, objects);
@@ -13769,7 +13801,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
 		relid = StatisticsGetRelation(oldId, false);
 		ATPostAlterTypeParse(oldId, relid, InvalidOid,
 							 (char *) lfirst(def_item),
-							 wqueue, lockmode, tab->rewrite);
+							 wqueue, lockmode, tab->rewrite,
+							 tab->verify_new_collation);
 
 		ObjectAddressSet(obj, StatisticExtRelationId, oldId);
 		add_exact_object_address(&obj, objects);
@@ -13832,7 +13865,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
  */
 static void
 ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
-					 List **wqueue, LOCKMODE lockmode, bool rewrite)
+					 List **wqueue, LOCKMODE lockmode, bool rewrite,
+					 bool verify_new_collation)
 {
 	List	   *raw_parsetree_list;
 	List	   *querytree_list;
@@ -13959,7 +13993,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 					/* rewriting neither side of a FK */
 					if (con->contype == CONSTR_FOREIGN &&
 						!rewrite && tab->rewrite == 0)
-						TryReuseForeignKey(oldId, con);
+						TryReuseForeignKey(oldId, con, verify_new_collation);
 					con->reset_default_tblspc = true;
 					cmd->subtype = AT_ReAddConstraint;
 					tab->subcmds[AT_PASS_OLD_CONSTR] =
@@ -14119,7 +14153,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
  * this constraint can be skipped.
  */
 static void
-TryReuseForeignKey(Oid oldId, Constraint *con)
+TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation)
 {
 	HeapTuple	tup;
 	Datum		adatum;
@@ -14151,6 +14185,9 @@ TryReuseForeignKey(Oid oldId, Constraint *con)
 		con->old_conpfeqop = lappend_oid(con->old_conpfeqop, rawarr[i]);
 
 	ReleaseSysCache(tup);
+
+	/* verify_new_collation is computed at ATExecAlterColumnType */
+	con->collation_recheck = verify_new_collation;
 }
 
 /*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ddfed02d..29b484f5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2773,7 +2773,8 @@ typedef struct Constraint
 	List	   *old_conpfeqop;	/* pg_constraint.conpfeqop of my former self */
 	Oid			old_pktable_oid;	/* pg_constraint.confrelid of my former
 									 * self */
-
+	bool		collation_recheck; /* does primary key columns collation changes need
+									* foreign key constraint recheck */
 	ParseLoc	location;		/* token location, or -1 if unknown */
 } Constraint;
 
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb44..ea28031b 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1933,6 +1933,24 @@ SELECT * FROM test11fk;
 ---
 (0 rows)
 
+-- Test altering the collation of the referenced column.
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+ERROR:  insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL:  Key (x)=(A) is not present in table "pktable".
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accents;
+ERROR:  insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL:  Key (x)=(A) is not present in table "pktable".
+-- should ok:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE case_insensitive;
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accent_case;
+DROP TABLE fktable, pktable;
 -- partitioning
 CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
 CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97..dcdbbc48 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -721,6 +721,22 @@ DELETE FROM test11pk WHERE x = 'abc';
 SELECT * FROM test11pk;
 SELECT * FROM test11fk;
 
+-- Test altering the collation of the referenced column.
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accents;
+
+-- should ok:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE case_insensitive;
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accent_case;
+DROP TABLE fktable, pktable;
 -- partitioning
 CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
 CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
-- 
2.34.1

Reply via email to