On Sun, May 26, 2024 at 11:18 PM Torsten Förtsch <tfoertsch...@gmail.com> wrote:
> Each query on the replica has a backend_xmin. You can see that in > pg_stat_activity. From that backend's perspective, tuples marked as deleted > by any transaction greater or equal to backend_xmin are still needed. This > does not depend on the table. > > Now, vacuum writes to the WAL up to which point it has vacuumed on the > master. In pg_waldump this looks like so: > > PRUNE snapshotConflictHorizon: 774, nredirected: 0, ndead: 5, nunused: 0, > redirected: [], dead: [2, 4, 6, 8, 10], unused: [], blkref #0: rel > 1663/5/16430 blk 0 > > That snapshotConflictHorizon is also a transaction id. If the backend_xmin > of all backends running transactions in the same database (the 5 in 16 > 63/5/16430) -as the vacuum WAL record is greater than vacuum's > snapshotConflictHorizon, then there is no conflict. If any of the > backend_xmin's is less, then there is a conflict. > > This type of conflict is determined by just 2 numbers, the conflict > horizon sent by the master in the WAL, and the minimum of all > backend_xmins. For your case this means a long running transaction querying > table t1 might have a backend_xmin of 223. On the master update and delete > operations happen on table T2. Since all the transactions on the master are > fast, when vacuum hits T2, the minimum of all backend_xmins on the master > might already be 425. So, garbage left over by all transactions up to 424 > can be cleaned up. Now that cleanup record reaches the replica. It compares > 223>425 which is false. So, there is a conflict. Now the replica can wait > until its own horizon reaches 425 or it can kill all backends with a lower > backend_xmin. > > As I understand, hot_standby_feedback does not work for you. Not sure if > you can run the query on the master? That would resolve the issues but > might generate the same bloat on the master as hot_standby_feedback. > Another option I can see is to run long running queries on a dedicated > replica with max_standby_streaming_delay set to infinity or something large > enough. If you go that way, you could also fetch the WAL from your > WAL archive instead of replicating from the master. That way the replica > has absolutely no chance to affect the master. > > Thank you so much. Would you agree that we should have two standby, one with default max_standby_streaming_delay (say 10 sec ) which will be mainly used as high availability and thus will be having minimal lag. and another standby with max_standby_streaming_delay as "-1" i.e. it will wait indefinitely for the SELECT queries to finish without caring about the lag, which will be utilized for the long running SELECT queries. And keep the hot_standby_feedback as ON for the first standby which is used as HA/high availability. And keep the hot_standby_feedback as OFF for the second standby which is utilized for long running SELECT queries, so that primary won't be waiting for the response/feedback from this standby to vacuum its old transactions and that will keep the transaction id wrap around issue from not happening because of the Read/Select queries on any of the standby.