Removing oids with pg_repack
ECT k,v FROM ONLY public.wipe_oid_test WITH NO DATALOG: (query) INSERT INTO repack.table_2273648077 SELECT k,v FROM ONLY public.wipe_oid_testLOG: (query) SELECT repack.disable_autovacuum('repack.table_2273648077')LOG: (query) COMMITLOG: (query) CREATE UNIQUE INDEX index_2273648083 ON repack.table_2273648077 USING btree (k)LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)LOG: (param:0) = SELECT * FROM repack.log_2273648077 ORDER BY id LIMIT $1LOG: (param:1) = INSERT INTO repack.table_2273648077 VALUES ($1.*)LOG: (param:2) = DELETE FROM repack.table_2273648077 WHERE (k) = ($1.k)LOG: (param:3) = UPDATE repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) = ($1.k)LOG: (param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (LOG: (param:5) = 1000LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'virtualxid' AND pid <> pg_backend_pid() AND virtualtransaction = ANY($1)LOG: (param:0) = {}LOG: (query) SAVEPOINT repack_sp1LOG: (query) SET LOCAL statement_timeout = 100LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODELOG: (query) RESET statement_timeoutLOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)LOG: (param:0) = SELECT * FROM repack.log_2273648077 ORDER BY id LIMIT $1LOG: (param:1) = INSERT INTO repack.table_2273648077 VALUES ($1.*)LOG: (param:2) = DELETE FROM repack.table_2273648077 WHERE (k) = ($1.k)LOG: (param:3) = UPDATE repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) = ($1.k)LOG: (param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (LOG: (param:5) = 0LOG: (query) SELECT repack.repack_swap($1)LOG: (param:0) = 2273648077LOG: (query) COMMITLOG: (query) BEGIN ISOLATION LEVEL READ COMMITTEDLOG: (query) SAVEPOINT repack_sp1LOG: (query) SET LOCAL statement_timeout = 100LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODELOG: (query) RESET statement_timeoutLOG: (query) SELECT repack.repack_drop($1, $2)LOG: (param:0) = 2273648077LOG: (param:1) = 4LOG: (query) COMMITLOG: (query) BEGIN ISOLATION LEVEL READ COMMITTEDLOG: (query) ANALYZE public.wipe_oid_testLOG: (query) COMMITLOG: (query) SELECT pg_advisory_unlock($1, CAST(-2147483648 + $2::bigint AS integer))LOG: (param:0) = 16185446LOG: (param:1) = 2273648077 But PostgreSQL still thinks that the table has oids: mydata=# \d+ wipe_oid_test Table "public.wipe_oid_test" Column | Type | Modifiers | Storage | Stats target | Description +--+---+--+--+- k | text | not null | extended | | v | text | | extended | | Indexes: "wipe_oid_test_pkey" PRIMARY KEY, btree (k)Has OIDs: yes mydata=# select oid,* from wipe_oid_test; oid | k | v -+---+--- 0 | 1 | 2 0 | 3 | 4 0 | a | b 0 | c | d(4 rows) I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class. So I'm definitely missing something. I really need a point in the right direction Please help! ;) CG
Re: Removing oids with pg_repack
On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios wrote: Στις 21/11/23 20:41, ο/η CG έγραψε: I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data. But PostgreSQL still thinks that the table has oids: mydata=# \d+ wipe_oid_test Table "public.wipe_oid_test" Column | Type | Modifiers | Storage | Stats target | Description +--+---+--+--+- k | text | not null | extended | | v | text | | extended | | Indexes: "wipe_oid_test_pkey" PRIMARY KEY, btree (k) Has OIDs: yes Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ? It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids. I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class. Just Dont! Noted. ;) So I'm definitely missing something. I really need a point in the right direction Please help! ;) There are a few of methods to get rid of OIDs : - ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that) This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford. - Use table copy + use of a trigger to log changes : https://dba.stackexchange.com/questions/259359/eliminating-oids-while-upgrading-postgresql-from-9-4-to-12 That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations. - Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) :https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/ This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration. For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole. CG -- Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt
Re: Removing oids with pg_repack
On Wednesday, November 22, 2023 at 12:38:54 PM EST, Achilleas Mantzios wrote: Στις 22/11/23 15:14, ο/η CG έγραψε: On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios wrote: Στις 21/11/23 20:41, ο/η CG έγραψε: I have a very large PostgreSQL 9.5 database that still has very large tables with oids. I'm trying to get rid of the oids with as little downtime as possible so I can prep the database for upgrade past PostgreSQL 11. I had a wild idea to mod pg_repack to write a new table without oids. I think it almost works. To test out my idea I made a new table wipe_oid_test with oids. I filled it with a few rows of data. But PostgreSQL still thinks that the table has oids: mydata=# \d+ wipe_oid_test Table "public.wipe_oid_test" Column | Type | Modifiers | Storage | Stats target | Description +--+---+--+--+- k | text | not null | extended | | v | text | | extended | | Indexes: "wipe_oid_test_pkey" PRIMARY KEY, btree (k) Has OIDs: yes Except where does it mention in the pg_repack docs (or source) that it is meant to be used for NO OIDS conversion ? It does not-- I was trying to leverage and tweak the base functionality of pg_repack which sets up triggers and migrates data. I figured if the target table was created without OIDs that when pg_repack did the "swap" operation that the new table would take over with the added bonus of not having oids. I can modify pg_class and set relhasoids = false, but it isn't actually eliminating the oid column. `\d+` will report not report that it has oids, but the oid column is still present and returns the same result before updating pg_class. Just Dont! Noted. ;) So I'm definitely missing something. I really need a point in the right direction Please help! ;) There are a few of methods to get rid of OIDs : - ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that) This makes the database unusable for hours and hours and hours because it locks the table entirely while it performs the operation. That's just something that we can't afford. - Use table copy + use of a trigger to log changes : https://dba.stackexchange.com/questions/259359/eliminating-oids-while-upgrading-postgresql-from-9-4-to-12 That SO is not quite the effect I'm going for. The poster of that SO was using OIDS in their application and needed a solution to maintain those values after conversion. I simply want to eliminate them without the extraordinary downtime the database would experience during ALTER operations. Sorry I meant this one : Stripping OIDs from tables in preparation for pg_upgrade | | | | | | | | | | | Stripping OIDs from tables in preparation for pg_upgrade I have a postgres database in RDS, file size approaching 1TB. We started in 2005, using ruby/activerecord/rails... | | | This is the same idea as the percona ETL strategy, and essentially 90% of what pg_repack already does (creates new tables, sets up triggers, locks the tables, and swaps new for old at the end of the process) - Use of Inheritance (the most neat solution I have seen, this is what I used for a 2TB table conversion) : https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/ This is closest to the effect I was going for. pg_repack essentially creates a second table and fills it with the data from the first table while ensuring standard db operations against that table continue to function while the data is being moved from the old table to the new table. The process outlined in the Percona ETL strategy has to be repeated per-table, which is work I was hoping to avoid by leveraging 95% of the functionality of pg_repack while supplying my own 5% as the resulting table would not have oids regardless of the source table's configuration. For my experiment, Table A did have oids. Table B (created by pg_repack) did not (at least at creation). When the "swap" operation happened in pg_repack, the metadata for Table A was assigned to Table B. I'm just trying to figure out what metadata I need to change in the system tables to reflect the actual table structure. I have the fallback position for the Percona ETL strategy. But I feel like I'm REALLY close with pg_repack and I just don't understand enough about the system internals to nudge it to correctness and need some expert assistance to tap it in the hole. Why don't just inspect the code pg_repack ? I have, and I have modified pg_repack (modification was shown in my first post) to create and write to a new table without oids, the problem is when the "swap" operation happ
Off-label use for pg_repack
Hi fellow list members. I hit a brick wall with my last question. I'd like to try this again. I need to remove OIDs from tables without locking the tables for long periods of time. I have developed a strategy that seems to work, but I would like the experts to weigh in since I'm planning on doing things to the system tables that are generally frowned upon. Prior to running pg_repack I perform these modifications: mydata=# update pg_class set relhasoids = false where oid = 'a_very_large_table_with_oids'::regclass::oid;UPDATE 1mydata=# delete from pg_attribute where attrelid = 'a_very_large_table_with_oids'::regclass::oid and attname = 'oid';DELETE 1mydata=# \d+ a_very_large_table_with_oids; Table "public.a_very_large_table_with_oids" Column | Type | Modifiers | Storage | Stats target | Description +--+---+--+--+- k | text | not null | extended | | v | text | | extended | | Indexes: "a_very_large_table_with_oids_pkey" PRIMARY KEY, btree (k) mydata=# select oid,* from a_very_large_table_with_oids;ERROR: column "oid" does not exist So far so good. I can insert update and delete rows, but the table structure on disk is unchanged. So after those modifications I repack the table with vanilla pg_repack. That copies the data to a fresh new table, sans oids. Before I start performing these operations on-line on the production data I wanted some expert eyes on this process since this is mission critical stuff. On the very large tables we will be in this limbo state for an extended period of time where pg_class and pg_attribute will have those forced modifications while pg_repack works its magic. Is there anything to be concerned about if insert/update/delete seems to be working? My insert/update/deletes on the tables while pg_repack is running seem to work fine. I also tried this on tables that have toast tables attached and upon first glance, everything seems to be in order. What have I missed?