Re: Logical replication, need to reclaim big disk space

2025-05-20 Thread Moreno Andreo



On 19/05/25 20:49, Achilleas Mantzios wrote:


On 19/5/25 17:38, Moreno Andreo wrote:




On 19/05/25 14:41, Achilleas Mantzios wrote:



On 5/19/25 09:14, Moreno Andreo wrote:


On 16/05/25 21:33, Achilleas Mantzios wrote:

On 16/5/25 18:45, Moreno Andreo wrote:


Hi,
    we are changing our old approach to binary data, moving it from
bytea fields in a table to external storage (making the database
smaller and the related operations faster and easier).
In short, we have a job that runs in the background, copies the data
from the table to an external file, and then sets the bytea field
to NULL.

(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = )

At the end of the operation, this leaves the table at less than one
tenth of its original size.
We have a multi-tenant architecture (hundreds of schemas with an
identical structure, all inheriting from public), and we are
performing the task on one table per schema.
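For clarity, the SQL side of one batch looks roughly like this (the
schema name, batch size and parameter binding are illustrative; the
copy to external storage itself happens outside the database):

  -- pick a batch of rows in one tenant schema that still hold a blob
  SELECT id FROM some_schema.tbl
   WHERE blob IS NOT NULL
   ORDER BY id
   LIMIT 100;

  -- once the corresponding file has been written successfully:
  UPDATE some_schema.tbl
     SET blob = NULL,
         ref  = 'path/to/file'
   WHERE id = $1;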


So? TOASTed data are kept in separate TOAST tables; unless those
bytea columns are actually selected, you won't even touch them. I
cannot understand what you are trying to achieve here.
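Just to put a number on it, something like this shows how much of a
table's footprint actually sits outside the main heap (the relation
name is a placeholder; pg_table_size includes the TOAST table, FSM
and VM, but not indexes):

  SELECT pg_size_pretty(pg_relation_size('some_schema.tbl'))   AS main_heap,
         pg_size_pretty(pg_table_size('some_schema.tbl')
                        - pg_relation_size('some_schema.tbl')) AS toast_and_maps;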


Years ago, when I made the mistake of going out for a coffee and
letting my developers "improvise", the result was a design similar to
what you are trying to achieve. Years later, I am seriously
considering moving that data back into PostgreSQL.
The "related operations" I was talking about are backups and 
database maintenance when needed, cluster/replica management, etc. 
With a smaller database size they would be easier in timing and 
effort, right?
Ok, but you'll lose replica functionality for those blobs, which 
means you don't care about them, correct me if I am wrong.
I'm not saying I don't care about them; quite the opposite. They are
protected with Object Versioning and soft deletion, which should
ensure good protection against e.g. ransomware if someone manages to
get in there (and if that happens, we'll have bigger troubles than
this).
PostgreSQL has become very popular because of people who care about
their data.
Yeah, it's always been famous for its robustness, and that's why I
chose PostgreSQL more than 10 years ago; in spite of how a "normal"
user treats his PC, we have never had corruption (only where the
filesystem or disk was failing, and that's not PostgreSQL's fault).
We are mostly talking about costs here. To call things by their name,
I'm moving bytea contents (85% of the total data) to files in Google
Cloud Storage buckets, which cost a fraction of the disks holding my
database (on GCE, to be clear).

May I ask the size of the bytea data (uncompressed)?
Single records vary from 150 kB to 80 MB; the grand total is more
than 8.5 TB out of a roughly 10 TB data footprint.
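(For reference, per-schema sizes can be checked with something like
the query below; the schema name is a placeholder, and the stored
size differs from the uncompressed one because of TOAST compression.)

  SELECT count(*)                                   AS blobs,
         pg_size_pretty(sum(pg_column_size(blob)))  AS stored_size,
         pg_size_pretty(sum(octet_length(blob)))    AS uncompressed_size
    FROM some_schema.tbl
   WHERE blob IS NOT NULL;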
This data is not accessed frequently (just by the owner when he needs
it), so there is no need to keep it on expensive hardware.
Over the years I've read that keeping many big bytea fields in a
database is not recommended, but I might have misunderstood that.


OK, I assume this is unimportant data, but let me ask: what is the
longevity / expected lifetime of that data? I haven't worked with
those services, I'm just reading:


https://cloud.google.com/storage/pricing?_gl=1*1b25r8o*_up*MQ..&gclid=CjwKCAjwravBBhBjEiwAIr30VKfaOJytxmk7J29vjG4rBBkk2EUimPU5zPibST73nm3XRL2h0O9SxRoCaogQAvD_BwE&gclsrc=aw.ds#storage-pricing

would you choose e.g. "Anywhere Cache storage"?

Absolutely not, this is *not* unimportant data, and we are using
Standard Storage at $0.02/GB/month plus operations, which compared to
$0.17/GB/month for an SSD (or even more for the Hyperdisks we are
using) is a good price drop.
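Back of the envelope, taking those figures at face value: 8.5 TB is
roughly 8,700 GB, so about 8,700 x $0.17 = ~$1,480/month on SSD
versus 8,700 x $0.02 = ~$174/month on Standard Storage, i.e. on the
order of $1,300/month saved, before operation and egress charges.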

How about hosting your data on your own storage and spending $0/GB/month?
If we could host on our own hardware, I wouldn't be here talking.
Maybe we would have a 10-node full-mesh multimaster architecture with
Barman backups on 2 separate SANs.
But we are a small company that has to balance performance,
consistency, security and, last but not least, costs. And margins are
tightening.




Another way would have been to move these tables to a different
tablespace on cheaper storage, but that would still have cost 3 times
as much as the buckets.


Can you actually mount those Cloud Storage buckets under a supported
FS on Linux and just move the tables to tablespaces backed by that
storage?


Never tried; I mounted one via FUSE and ran some simple operations on
it in the past, but I'm not sure it could handle database workloads
in terms of I/O bandwidth.
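Just to illustrate what it would look like if such a mount existed
(the mount point and names are hypothetical, e.g. a bucket mounted at
/mnt/gcs via gcsfuse); I wouldn't trust a FUSE object-storage mount
with fsync-critical database files, though:

  -- hypothetical: bucket mounted at /mnt/gcs
  CREATE TABLESPACE cold_blobs LOCATION '/mnt/gcs/pg_tblspc';
  -- rewrites the table onto that storage
  ALTER TABLE some_schema.tbl SET TABLESPACE cold_blobs;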



Why are you considering moving the data back into database tables?
Because now, if we need to migrate from cloud to on-premise, or just
upgrade or move the specific server which holds that data, I will
have an extra headache. It is also a single point of failure, or at
best a source of fragmented technology introduced just for the sake
of keeping things out of the DB.
This is managed as a hierarchical disk structure, so the calling
server may be literally anywhere; it just needs an account (or a
service account) to get in there,
and you are locke

Do stuck replication slots prevent autovacuum from running entirely?

2025-05-20 Thread Marcelo Fernandes
Hi there,

I am trying to understand whether a stuck replication slot would be sufficient to
stop an autovacuum from even starting.

Couldn't the autovacuum process start, but fail to remove dead tuples that
are still needed by the replication slot? Why would it prevent autovacuum
from even starting instead?

Thanks.




Re: Do stuck replication slots prevent autovacuum from running entirely?

2025-05-20 Thread Laurenz Albe
On Wed, 2025-05-21 at 16:34 +1200, Marcelo Fernandes wrote:
> I am trying to understand whether a stuck replication slot would be sufficient to
> stop an autovacuum from even starting.
> 
> Couldn't the autovacuum process start, but fail to remove dead tuples that
> are still needed by the replication slot? Why would it prevent autovacuum
> from even starting instead?

I cannot think of a reason why an abandoned replication slot (not sure what
you mean by "stuck") would keep an autovacuum worker from starting.
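
If you want to see what a slot is actually holding on to (retained WAL
via restart_lsn, and any xmin/catalog_xmin that would keep dead tuples
from being cleaned up once VACUUM does run), something like this is a
quick check:

  SELECT slot_name, slot_type, active, xmin, catalog_xmin, restart_lsn
    FROM pg_replication_slots;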

Typical reasons why autovacuum doesn't start running on a table are:

- there are already "autovacuum_max_workers" worker processes running

- the thresholds for dead or inserted tuples have not been crossed

- the statistics collector has a problem and doesn't gather statistics;
  this applies mostly to v14 and older, see
  https://www.cybertec-postgresql.com/en/stale-statistics-cause-table-bloat/

- the parameter "track_counts" was disabled, so that PostgreSQL doesn't
  collect the tuple statistics autovacuum relies on

- the functions "pg_stat_reset" or "pg_stat_reset_single_table_counters"
  are called repeatedly and wipe out table statistics
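
As a quick way to check the counters mentioned above and when autovacuum
last processed a table (the table name is a placeholder):

  SELECT relname, n_live_tup, n_dead_tup, n_ins_since_vacuum,
         last_autovacuum, last_autoanalyze
    FROM pg_stat_user_tables
   WHERE relname = 'your_table';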

Yours,
Laurenz Albe