It looks like updating pg_constraint isn't an option on AWS RDS, since RDS doesn't give you superuser access. Thanks a lot for the suggestion anyway.
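(For context, the step that is blocked is the direct catalog update from the procedure quoted below. On RDS the rds_superuser role is not a real superuser and has no write access to the system catalogs, so a statement like the following is rejected with a permission error; the constraint and table names here are just the ones from the example further down:

UPDATE pg_constraint
SET convalidated = true
WHERE conname = 'transaction_time_not_null'
  AND conrelid = 'my.table'::regclass;
)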
On Fri, 15 Nov 2024 at 15:28, Torsten Förtsch <tfoertsch...@gmail.com> wrote:

> On Fri, Nov 15, 2024 at 9:38 AM Philip Couling <coul...@gmail.com> wrote:
>
>> Is there a solid reason why adding a check constraint does not use
>> existing indexes for validation?
>>
>> We are currently looking at partitioning a multi-TB table, leaving all
>> existing data in place and simply attaching it as a partition to a new
>> table. To prevent locking, we are trying to add an INVALID check
>> constraint first and then validate it.
>>
>> I can trivially prove the invalid constraint is valid with a simple
>> SELECT which will use an existing index and return instantaneously. But
>> AFAIK there's no way to mark a constraint as valid without scanning all
>> the rows.
>
> Most likely your query is not exactly the same as the check constraint.
> Think about NULL and similar.
>
>> This operation is really problematic on a production database with
>> heavy IO load.
>>
>> Is there a solid reason why validating check constraints cannot use
>> existing indexes? If I can prove the constraint is valid so trivially
>> with a SELECT, then why can Postgres not do the same (or similar)?
>
> Here is what has worked for me many times:
>
> 1. Create the check constraint as NOT VALID. From now on, no new or
>    updated row is allowed to violate it.
> 2. Check that the constraint holds with a query on a binary replica.
>    Make sure the query starts only when the constraint is visible on
>    the replica.
> 3. On the master: UPDATE pg_constraint SET convalidated=true WHERE
>    conname='your_constraint_name' AND conrelid='schema.table'::regclass
>
> How you perform step 2 is irrelevant. Checking it on a replica simply
> avoids the load on the master. You just need to make sure there is no
> conflicting data in the table.
>
> WARNING, you need to be damn sure of your data if you do that. But if
> you are, it works.
>
> Here is the procedure with which I solved the same problem for some of
> our multi-TB tables (PG14):
>
> The table has a column called transaction_time. We wanted to partition
> by that column. For some historical reason the column did not have a
> NOT NULL constraint. However, there was no way our processes could
> insert NULL into that column, and there was no row with NULL in that
> field. So, the first step was to add the NOT NULL constraint:
>
> BEGIN;
>
> ALTER TABLE my.table
>   ADD CONSTRAINT transaction_time_not_null
>   CHECK (transaction_time IS NOT NULL) NOT VALID;
>
> UPDATE pg_constraint
> SET convalidated=true
> WHERE conname = 'transaction_time_not_null'
>   AND conrelid = 'my.table'::REGCLASS
> RETURNING conname, conrelid::REGCLASS, convalidated;
>
> COMMIT;
>
> Now, for cosmetic purposes, we first turn the check constraint above
> into a normal NOT NULL constraint:
>
> BEGIN;
>
> SET LOCAL client_min_messages = 'debug4';
> -- expecting this message:
> -- DEBUG: existing constraints on column "table.transaction_time" are
> -- sufficient to prove that it does not contain nulls
> ALTER TABLE my.table
>   ALTER COLUMN transaction_time SET NOT NULL;
> RESET client_min_messages;
>
> ALTER TABLE my.table
>   DROP CONSTRAINT transaction_time_not_null;
>
> COMMIT;
>
> If you set client_min_messages to something like debug4, the database
> tells you whether it wants to scan the table or whether existing
> constraints are sufficient to prove the condition.
>
> transaction_time in our case is never in the future. Also, database
> transactions are short-lived. Nothing exceeds 30 seconds.
> So, I can trust that the effect of any action performed more than 30
> seconds ago is visible in the database.
>
> So, I set the time after which new rows go to the new partition to at
> least 10 minutes from now, at the next hour boundary. 30 seconds would
> be good enough; I chose 10 minutes just for extra safety.
>
> SELECT date_trunc('hour', now() + '1h 10min'::interval)::timestamp AS switch_time\gset
>
> Next comes the actual change:
>
> BEGIN;
>
> -- rename the existing table
> ALTER TABLE my.table RENAME TO table_old;
>
> -- drop triggers. We will recreate them later.
> DROP TRIGGER ... ON my.table_old;
> DROP TRIGGER ...;
>
> -- create the partitioned table
> CREATE TABLE my.table (
>   LIKE my.table_old
>   INCLUDING DEFAULTS
>   INCLUDING CONSTRAINTS
> )
> PARTITION BY RANGE (transaction_time);
>
> -- recreate triggers
> CREATE TRIGGER ...
>   BEFORE DELETE ON my.table
>   FOR EACH ROW EXECUTE FUNCTION my.trigger_fun();
> CREATE TRIGGER ...;
>
> -- create the partition for future rows
> CREATE TABLE my.table_current PARTITION OF my.table
>   FOR VALUES FROM (:'switch_time') TO ('infinity');
>
> -- and some indexes and FKs
> ALTER TABLE my.table_current ADD PRIMARY KEY (id);
> CREATE INDEX ON my.table_current (...);
> ALTER TABLE my.table_current
>   ADD CONSTRAINT fk_name
>   FOREIGN KEY (...)
>   REFERENCES ...(...) ON UPDATE ... ON DELETE ...;
>
> -- now add the partition boundaries to the old table as a check constraint
> ALTER TABLE my.table_old
>   ADD CONSTRAINT partition_boundaries
>   CHECK ('-infinity' <= transaction_time AND transaction_time < :'switch_time')
>   NOT VALID;
>
> -- for procedural reasons we know the constraint is valid. Let's make PG
> -- believe it too.
> UPDATE pg_constraint
> SET convalidated=true
> WHERE conname='partition_boundaries'
>   AND conrelid='my.table_old'::REGCLASS
> RETURNING conname, conrelid::REGCLASS, convalidated;
>
> -- now attach it. We use debug4 again to check that the table scan is
> -- avoided.
> SET LOCAL client_min_messages = 'debug4';
> -- expecting:
> -- DEBUG: partition constraint for table "table_old" is implied by
> -- existing constraints
> ALTER TABLE my.table ATTACH PARTITION my.table_old
>   FOR VALUES FROM ('-infinity') TO (:'switch_time');
> RESET client_min_messages;
>
> -- drop the now unnecessary constraint
> ALTER TABLE my.table_old
>   DROP CONSTRAINT partition_boundaries;
>
> COMMIT;
>
> Once the new partition gets too full, we will use a similar procedure to
> adjust the partition boundary of the new partition and then create the
> next partition.
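For completeness, here is a rough sketch of what that last step might look like, reusing the same NOT VALID constraint plus pg_constraint trick. The psql variable next_switch_time and the partition name my.table_next are illustrative, not from the procedure above, and the same "be damn sure of your data" caveat applies:

BEGIN;

-- cap the current open-ended partition at the new boundary:
-- detach it, add a check constraint we already know holds,
-- then re-attach with the narrower range so no scan is needed
ALTER TABLE my.table DETACH PARTITION my.table_current;

ALTER TABLE my.table_current
  ADD CONSTRAINT partition_boundaries
  CHECK (:'switch_time' <= transaction_time AND transaction_time < :'next_switch_time')
  NOT VALID;

-- only safe if you are certain no row in my.table_current is at or past
-- next_switch_time (e.g. because that boundary is far enough in the future)
UPDATE pg_constraint
SET convalidated = true
WHERE conname = 'partition_boundaries'
  AND conrelid = 'my.table_current'::REGCLASS;

SET LOCAL client_min_messages = 'debug4';
-- expecting the same "implied by existing constraints" message as above
ALTER TABLE my.table ATTACH PARTITION my.table_current
  FOR VALUES FROM (:'switch_time') TO (:'next_switch_time');
RESET client_min_messages;

ALTER TABLE my.table_current DROP CONSTRAINT partition_boundaries;

-- and create the next partition for rows from the new boundary onwards
CREATE TABLE my.table_next PARTITION OF my.table
  FOR VALUES FROM (:'next_switch_time') TO ('infinity');

COMMIT;

Note that DETACH/ATTACH takes an ACCESS EXCLUSIVE lock on the parent for the duration of the transaction, so the transaction should be kept short.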