From: "David G. Johnston" <david.g.johns...@gmail.com>
Date: Friday, May 8, 2020 at 11:48 AM
To: "Fehrle, Brian" <bfeh...@comscore.com>
Cc: "pgsql-gene...@postgresql.org" <pgsql-gene...@postgresql.org>
Subject: Re: Thoughts on how to avoid a massive integer update.


On Mon, May 4, 2020 at 2:33 PM Fehrle, Brian 
<bfeh...@comscore.com> wrote:
I NEED to update every single row in all these tables, changing the integer 
value to a different integer.

Does anyone have any hackery ideas on how to achieve this in less time?

Probably the only approach that would run faster would take as long or longer to 
code and debug, and would be considerably riskier: shut down PostgreSQL and 
modify the data files directly, changing one integer byte sequence to another.  
On the positive side, PostgreSQL is open source and that data, while complex, 
is structured.

On the point of "vacuum" versus "vacuum full" - I don't know if this is exactly 
how it would work in reality, but conceptually, if you updated half the table, 
vacuumed, then updated the second half and vacuumed again, the second batch of 
updates would reuse the space freed by the first batch, and you'd only increase 
disk consumption to 1.5x instead of 2.0x.  As you increase the number of 
batches, the percentage of additional space consumed decreases (a sketch of the 
idea follows).  Though if you have the space, I'd have to imagine that creating 
a brand new table and dropping the old one would be the best solution when 
taken in isolation.
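A minimal sketch of that batching idea, with all names hypothetical (a table 
big_table with primary key id, and ref_id standing in for the integer column 
and for the real old-to-new mapping):

-- Batch 1 of 2: rewrite the first half of the rows.
UPDATE big_table
   SET ref_id = ref_id + 1000000        -- placeholder for the real mapping
 WHERE id <= (SELECT max(id) / 2 FROM big_table);

-- A plain VACUUM (not FULL) marks the space held by the dead row versions
-- as reusable within the table; it does not shrink the file on disk.
VACUUM big_table;

-- Batch 2 of 2: the new row versions written here can land in the space
-- just freed, so the table peaks around 1.5x rather than 2x.
UPDATE big_table
   SET ref_id = ref_id + 1000000
 WHERE id > (SELECT max(id) / 2 FROM big_table);

VACUUM big_table;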

David J.


Modifying data files is too risky and I wouldn’t be able to get that kind of 
work approved.

Even with keeping the excess space to an additional 50%, that’s tons of storage 
I’d need to order, so VACUUM FULL or re-creating the tables are my only options 
for minimal on-disk usage.
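For the re-create route, a minimal sketch under assumed names (big_table and 
its integer column ref_id; none of these names come from the thread).  CREATE 
TABLE AS writes the rows densely packed, so the result is as compact as VACUUM 
FULL would produce, but it needs room for both copies while it runs:

BEGIN;
-- Write the rewritten rows into a fresh, densely packed table.
CREATE TABLE big_table_new AS
SELECT id,
       ref_id + 1000000 AS ref_id       -- placeholder for the real mapping
  FROM big_table;

DROP TABLE big_table;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;

-- CREATE TABLE AS does not carry over indexes, constraints, defaults, or
-- grants; those must be re-created afterwards.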
