Removing oids with pg_repack

2023-11-21 Thread CG
ECT k,v 
FROM ONLY public.wipe_oid_test WITH NO DATALOG: (query) INSERT INTO 
repack.table_2273648077 SELECT k,v FROM ONLY public.wipe_oid_testLOG: (query) 
SELECT repack.disable_autovacuum('repack.table_2273648077')LOG: (query) 
COMMITLOG: (query) CREATE UNIQUE INDEX index_2273648083 ON 
repack.table_2273648077 USING btree (k)LOG: (query) SELECT 
repack.repack_apply($1, $2, $3, $4, $5, $6)LOG:  (param:0) = SELECT * FROM 
repack.log_2273648077 ORDER BY id LIMIT $1LOG:  (param:1) = INSERT INTO 
repack.table_2273648077 VALUES ($1.*)LOG:  (param:2) = DELETE FROM 
repack.table_2273648077 WHERE (k) = ($1.k)LOG:  (param:3) = UPDATE 
repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) = ($1.k)LOG:  
(param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (LOG:  (param:5) = 
1000LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'virtualxid' AND pid 
<> pg_backend_pid() AND virtualtransaction = ANY($1)LOG:  (param:0) = {}LOG: 
(query) SAVEPOINT repack_sp1LOG: (query) SET LOCAL statement_timeout = 100LOG: 
(query) LOCK TABLE public.wipe_oid_test IN ACCESS EXCLUSIVE MODELOG: (query) 
RESET statement_timeoutLOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, 
$5, $6)LOG:  (param:0) = SELECT * FROM repack.log_2273648077 ORDER BY id LIMIT 
$1LOG:  (param:1) = INSERT INTO repack.table_2273648077 VALUES ($1.*)LOG:  
(param:2) = DELETE FROM repack.table_2273648077 WHERE (k) = ($1.k)LOG:  
(param:3) = UPDATE repack.table_2273648077 SET (k, v) = ($2.k, $2.v) WHERE (k) 
= ($1.k)LOG:  (param:4) = DELETE FROM repack.log_2273648077 WHERE id IN (LOG:  
(param:5) = 0LOG: (query) SELECT repack.repack_swap($1)LOG:  (param:0) = 
2273648077LOG: (query) COMMITLOG: (query) BEGIN ISOLATION LEVEL READ 
COMMITTEDLOG: (query) SAVEPOINT repack_sp1LOG: (query) SET LOCAL 
statement_timeout = 100LOG: (query) LOCK TABLE public.wipe_oid_test IN ACCESS 
EXCLUSIVE MODELOG: (query) RESET statement_timeoutLOG: (query) SELECT 
repack.repack_drop($1, $2)LOG:  (param:0) = 2273648077LOG:  (param:1) = 4LOG: 
(query) COMMITLOG: (query) BEGIN ISOLATION LEVEL READ COMMITTEDLOG: (query) 
ANALYZE public.wipe_oid_testLOG: (query) COMMITLOG: (query) SELECT 
pg_advisory_unlock($1, CAST(-2147483648 + $2::bigint AS integer))LOG:  
(param:0) = 16185446LOG:  (param:1) = 2273648077

But PostgreSQL still thinks that the table has oids:
mydata=# \d+ wipe_oid_test                   Table "public.wipe_oid_test" 
Column | Type | Modifiers | Storage  | Stats target | Description 
+--+---+--+--+- k      | 
text | not null  | extended |              |  v      | text |           | 
extended |              | Indexes:    "wipe_oid_test_pkey" PRIMARY KEY, btree 
(k)Has OIDs: yes
mydata=# select oid,* from wipe_oid_test; oid | k | v -+---+---   0 | 1 | 2 
  0 | 3 | 4   0 | a | b   0 | c | d(4 rows)
I can modify pg_class and set relhasoids = false, but it isn't actually 
eliminating the oid column. `\d+` will report not report that it has oids, but 
the oid column is still present and returns the same result before updating 
pg_class.
So I'm definitely missing something. I really need a point in the right 
direction Please help! ;)
CG





Re: Removing oids with pg_repack

2023-11-22 Thread CG
 

On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios 
 wrote:  
 
  Στις 21/11/23 20:41, ο/η CG έγραψε:
  
 
 I have a very large PostgreSQL 9.5 database that still has very large tables 
with oids. I'm trying to get rid of the oids with as little downtime as 
possible so I can prep the database for upgrade past PostgreSQL 11. I had a 
wild idea to mod pg_repack to write a new table without oids. I think it almost 
works.  
  To test out my idea I made a new table wipe_oid_test with oids. I filled it 
with a few rows of data. 
  
  But PostgreSQL still thinks that the table has oids: 
mydata=# \d+ wipe_oid_test                    Table "public.wipe_oid_test"  
Column | Type | Modifiers | Storage  | Stats target | Description  
+--+---+--+--+-  k      | 
text | not null  | extended |              |   v      | text |           | 
extended |              |  Indexes:     "wipe_oid_test_pkey" PRIMARY KEY, btree 
(k) Has OIDs: yes
 Except where does it mention in the pg_repack docs (or source) that it is 
meant to be used for NO OIDS conversion ?
It does not-- I was trying to leverage and tweak the base functionality of 
pg_repack which sets up triggers and migrates data. I figured if the target 
table was created without OIDs that when pg_repack did the "swap" operation 
that the new table would take over with the added bonus of not having oids.
 
 I can modify pg_class and set relhasoids = false, but it isn't actually 
eliminating the oid column. `\d+` will report not report that it has oids, but 
the oid column is still present and returns the same result before updating 
pg_class. 
   
 Just Dont!
Noted. ;)
 
  So I'm definitely missing something. I really need a point in the right 
direction Please help! ;) 
   
 
There are a few of methods to get rid of OIDs :
 - ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)
 
This makes the database unusable for hours and hours and hours because it locks 
the table entirely while it performs the operation. That's just something that 
we can't afford.
 - Use table copy +  use of a trigger to log changes : 
https://dba.stackexchange.com/questions/259359/eliminating-oids-while-upgrading-postgresql-from-9-4-to-12
That SO is not quite the effect I'm going for. The poster of that SO was using 
OIDS in their application and needed a solution to maintain those values after 
conversion. I simply want to eliminate them without the extraordinary downtime 
the database would experience during ALTER operations.
 
- Use of Inheritance (the most neat solution I have seen, this is what I used 
for a 2TB table conversion) 
:https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/
 This is closest to the effect I was going for. pg_repack essentially creates a 
second table and fills it with the data from the first table while ensuring 
standard db operations against that table continue to function while the data 
is being moved from the old table to the new table. The process outlined in the 
Percona ETL strategy has to be repeated per-table, which is work I was hoping 
to avoid by leveraging 95% of the functionality of pg_repack while supplying my 
own 5% as the resulting table would not have oids regardless of the source 
table's configuration.
For my experiment, Table A did have oids. Table B (created by pg_repack) did 
not (at least at creation). When the "swap" operation happened in pg_repack, 
the metadata for Table A was assigned to Table B. I'm just trying to figure out 
what metadata I need to change in the system tables to reflect the actual table 
structure. 
I have the fallback position for the Percona ETL strategy. But I feel like I'm 
REALLY close with pg_repack and I just don't understand enough about the system 
internals to nudge it to correctness and need some expert assistance to tap it 
in the hole.

 
  CG 
  
  
  
   
 -- 
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt   

Re: Removing oids with pg_repack

2023-11-27 Thread CG
 

On Wednesday, November 22, 2023 at 12:38:54 PM EST, Achilleas Mantzios 
 wrote:  
 
  Στις 22/11/23 15:14, ο/η CG έγραψε:
  
 
 
  
  On Wednesday, November 22, 2023 at 01:20:18 AM EST, Achilleas Mantzios 
 wrote:  
  
 Στις 21/11/23 20:41, ο/η CG έγραψε:
  
 
I have a very large PostgreSQL 9.5 database that still has very large 
tables with oids. I'm trying to get rid of the oids with as little downtime as 
possible so I can prep the database for upgrade past PostgreSQL 11. I had a 
wild idea to mod pg_repack to write a new table without oids. I think it almost 
works.  
  To test out my idea I made a new table wipe_oid_test with oids. I filled it 
with a few rows of data. 
  
  But PostgreSQL still thinks that the table has oids: 
mydata=# \d+ wipe_oid_test                    Table "public.wipe_oid_test"  
Column | Type | Modifiers | Storage  | Stats target | Description  
+--+---+--+--+-  k      | 
text | not null  | extended |              |   v      | text |           | 
extended |              |  Indexes:     "wipe_oid_test_pkey" PRIMARY KEY, btree 
(k) Has OIDs: yes
 Except where does it mention in the pg_repack docs (or source) that it is 
meant to be used for NO OIDS conversion ? 
  It does not-- I was trying to leverage and tweak the base functionality of 
pg_repack which sets up triggers and migrates data. I figured if the target 
table was created without OIDs that when pg_repack did the "swap" operation 
that the new table would take over with the added bonus of not having oids. 
 
 I can modify pg_class and set relhasoids = false, but it isn't actually 
eliminating the oid column. `\d+` will report not report that it has oids, but 
the oid column is still present and returns the same result before updating 
pg_class. 
   
 Just Dont! 
  Noted. ;) 
 
  So I'm definitely missing something. I really need a point in the right 
direction Please help! ;) 
   
 
There are a few of methods to get rid of OIDs :
 - ALTER TABLE .. SET WITHOUT OIDS (just mentioning, you already checked that)
  
  This makes the database unusable for hours and hours and hours because it 
locks the table entirely while it performs the operation. That's just something 
that we can't afford. 
  - Use table copy +  use of a trigger to log changes : 
https://dba.stackexchange.com/questions/259359/eliminating-oids-while-upgrading-postgresql-from-9-4-to-12
 
  That SO is not quite the effect I'm going for. The poster of that SO was 
using OIDS in their application and needed a solution to maintain those values 
after conversion. I simply want to eliminate them without the extraordinary 
downtime the database would experience during ALTER operations.  Sorry I 
meant this one : Stripping OIDs from tables in preparation for pg_upgrade



| 
| 
| 
|  |  |

 |

 |
| 
|  | 
Stripping OIDs from tables in preparation for pg_upgrade

I have a postgres database in RDS, file size approaching 1TB. We started in 
2005, using ruby/activerecord/rails...
 |

 |

 |



This is the same idea as the percona ETL strategy, and essentially 90% of what 
pg_repack already does (creates new tables, sets up triggers, locks the tables, 
and swaps new for old at the end of the process) 

 
  
  
- Use of Inheritance (the most neat solution I have seen, this is what I used 
for a 2TB table conversion) : 
https://www.percona.com/blog/performing-etl-using-inheritance-in-postgresql/
This is closest to the effect I was going for. pg_repack essentially 
creates a second table and fills it with the data from the first table while 
ensuring standard db operations against that table continue to function while 
the data is being moved from the old table to the new table. The process 
outlined in the Percona ETL strategy has to be repeated per-table, which is 
work I was hoping to avoid by leveraging 95% of the functionality of pg_repack 
while supplying my own 5% as the resulting table would not have oids regardless 
of the source table's configuration. 
  For my experiment, Table A did have oids. Table B (created by pg_repack) did 
not (at least at creation). When the "swap" operation happened in pg_repack, 
the metadata for Table A was assigned to Table B. I'm just trying to figure out 
what metadata I need to change in the system tables to reflect the actual table 
structure.  
  I have the fallback position for the Percona ETL strategy. But I feel like 
I'm REALLY close with pg_repack and I just don't understand enough about the 
system internals to nudge it to correctness and need some expert assistance to 
tap it in the hole. 
 Why don't just inspect the code pg_repack ? 
I have, and I have modified pg_repack (modification was shown in my first post) 
to create and write to a new table without oids, the problem is when the "swap" 
operation happ

Off-label use for pg_repack

2023-11-28 Thread CG
Hi fellow list members. I hit a brick wall with my last question. I'd like to 
try this again. 
I need to remove OIDs from tables without locking the tables for long periods 
of time. I have developed a strategy that seems to work, but I would like the 
experts to weigh in since I'm planning on doing things to the system tables 
that are generally frowned upon.
Prior to running pg_repack I perform these modifications:
mydata=# update pg_class set relhasoids = false where oid = 
'a_very_large_table_with_oids'::regclass::oid;UPDATE 1mydata=# delete from 
pg_attribute where attrelid = 'a_very_large_table_with_oids'::regclass::oid and 
attname = 'oid';DELETE 1mydata=# \d+ a_very_large_table_with_oids;          
Table "public.a_very_large_table_with_oids" Column | Type | Modifiers | Storage 
 | Stats target | Description 
+--+---+--+--+- k      | 
text | not null  | extended |              |  v      | text |           | 
extended |              | Indexes:    "a_very_large_table_with_oids_pkey" 
PRIMARY KEY, btree (k)
mydata=# select oid,* from a_very_large_table_with_oids;ERROR:  column "oid" 
does not exist
So far so good. I can insert update and delete rows, but the table structure on 
disk is unchanged. 
So after those modifications I repack the table with vanilla pg_repack. That 
copies the data to a fresh new table, sans oids.
Before I start performing these operations on-line on the production data I 
wanted some expert eyes on this process since this is mission critical stuff. 
On the very large tables we will be in this limbo state for an extended period 
of time where pg_class and pg_attribute will have those forced modifications 
while pg_repack works its magic. Is there anything to be concerned about if 
insert/update/delete seems to be working? My insert/update/deletes on the 
tables while pg_repack is running seem to work fine. I also tried this on 
tables that have toast tables attached and upon first glance, everything seems 
to be in order.
What have I missed?