________________________________
From: Hannu Krosing <han...@google.com>
To: Greg Sabino Mullane <htamf...@gmail.com>
Cc: Lars Aksel Opsahl <lars.ops...@nibio.no>; Tom Lane <t...@sss.pgh.pa.us>; 
Christophe Pettus <x...@thebuild.com>; pgsql-performance@lists.postgresql.org 
<pgsql-performance@lists.postgresql.org>
Subject: Re: PostgreSQL and a Catch-22 Issue related to dead rows

If there are unremovable rows it usually also means that index-only scan 
degrades to index-scan-with-visibility-checks-in-tables.

I think the ask is to be able to remove the recently dead rows that are not 
visible in any current snapshot and can never become visible to any future 
snapshot,
Something that Tom described as currently not easily doable above.

Maybe we could figure out a way for long-running transactions to upload their 
snapshot set to some meta-vacuum process which can use it to determine which 
rows fall into that category. In this way the change would affect only the 
long-running transactions and if we do it onl maybe once a minute it should not 
be too heavy overhead even for these transactions.
Hi

This is tested but only works when the load is very low.

To fix the problem with xmin we use this 
https://gitlab.com/nibioopensource/postgres_execute_parallel to start the jobs, 
 but the control is returned to the caller at given intervals, and the caller 
also does a commit and gets a new snapshot and xmin is updated. If all jobs 
done are done the caller may start a new batch of jobs.

One of the 750 jobs takes for hours to remove 
(https://postgis.net/docs/ST_RemEdgeNewFace.html) almost 30.000 edges.  I do 
this in a loop takes around 500 edges in every loop. If the time used is to 
long I forced a commit in this loop in addition to the analyze I do. Adding the 
commit really helps on execution time but only when running a very few jobs in 
parallel. With this new commit I could see that dead rows were removed.

The issue with the new commit in the loop where I remove edges is that it 
becomes problem when many treads run parallel related to subtransactions per 
transaction
like discussed here https://postgrespro.com/list/thread-id/2478202 and many 
other threads. Almost like a catch 22, fixing the dead rows problem is causing 
a new bigger problem.

So to solve problems with subtransactions per transaction I can make new 
joblist for removing edges also, but that makes things more complicated.

Thanks

Lars.

Reply via email to