Re: Synchronizing slots from primary to standby

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote: > > > Attach a new vers

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 02:35:22PM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 1:08 PM Bertrand Drouvot > wrote: > > > > On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote: > > > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot >

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > wrote: > > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote: > > > > > > Commit message states: "why we can't just u

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-31 Thread Bertrand Drouvot
Hi, On Mon, Apr 01, 2024 at 09:04:43AM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot > wrote: > > > > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote: > > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot > > > w

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
use this field. It's usually better to just expose the data, and if the user needs helps to make sense of that data, then give them that help separately. " [1]: https://www.postgresql.org/message-id/CAA4eK1JtKieWMivbswYg5FVVB5FugCftLvQKVsxh%3Dm_8nk04vw%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CA%2BTgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
pBuildSerialize() maintains a "small list" of "already serialized" snapshots. [1]: https://www.postgresql.org/message-id/ZgayTFIhLfzhpHci%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
Hi, On Sun, Mar 31, 2024 at 10:25:46AM +0530, Bharath Rupireddy wrote: > On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot > wrote: > > I think in this case it should always reflect the value from the primary (so > > that one can understand why it is invalidated). > > I

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
Hi, On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote: > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot > wrote: > > Then there is no need to call WaitForStandbyConfirmation() as it could go > > until > > the RecoveryInProgress() in StandbySlotsHaveCaugh

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 04:24:49AM +, Zhijie Hou (Fujitsu) wrote: > On Monday, April 1, 2024 9:28 PM Bertrand Drouvot > wrote: > > > > On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote: > > > On Mon, Apr 1, 2024 at 2:51 PM Bertrand D

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
pdate > inactive_since in a different way than other parameters. > > Or a simple solution is that the slotsync worker updates > inactive_since as it does for non-synced slots, and disables > timeout-based slot invalidation for synced slots. Yeah, I think the main question to help us decide is: do we want to invalidate "inactive" synced slots locally (in addition to synchronizing the invalidation from the primary)? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bertrand Drouvot
if (AmCheckpointerProcess() || AmBackgroundWriterProcess()) INJECTION_POINT("bgw-log-standby-snapshot"); And make use of it in the test, something like: $node_primary->safe_psql('postgres', "SELECT injection_points_attach(&#x

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 12:41:35PM +0530, Bharath Rupireddy wrote: > On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot > wrote: > > > > > Or a simple solution is that the slotsync worker updates > > > inactive_since as it does for non-synced slots, and dis

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 02:19:30PM +0530, Amit Kapila wrote: > On Tue, Apr 2, 2024 at 1:54 PM Bertrand Drouvot > wrote: > > What about adding a "wait" injection point in LogStandbySnapshot() to > > prevent > > checkpointer/bgwriter to log a standby snapshot

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bertrand Drouvot
# Check that the captured time is sane + if (defined $reference_time) + { s/Check that the captured time is sane/Check that the inactive_since is sane/? Sorry if some of those comments could have been done while I did review v29-0001. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
on primary due to inactive timeout GUC. Also, check the logical s/inactive timeout GUC/replication_slot_inactive_timeout/? CR9 === +# Start: Helper functions used for this test file +# End: Helper functions used for this test file I think that's the first TAP test with this comment. Not say

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
ing <= we are not testing that it must get its own inactive_since (as we allow them to be equal in the test). I think we should just add some usleep() where appropriate and deny equality during the tests on inactive_since. Except for the above, v32-0001 LGTM. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
Hi, On Wed, Apr 03, 2024 at 08:28:04PM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot > wrote: > > > > Just one comment on v32-0001: > > > > +# Synced slot on the standby must get its own inactive_since.

Re: Autogenerate some wait events code and documentation

2024-04-04 Thread Bertrand Drouvot
Hi, On Thu, Apr 04, 2024 at 03:50:21PM +0900, Michael Paquier wrote: > On Tue, Mar 19, 2024 at 07:34:09AM +0000, Bertrand Drouvot wrote: > > I'm not sure as v2 used the "version >= 17" wording which I think would not > > need > > manual refresh

Re: Autogenerate some wait events code and documentation

2024-04-04 Thread Bertrand Drouvot
Hi, On Thu, Apr 04, 2024 at 07:14:47PM +0900, Michael Paquier wrote: > On Thu, Apr 04, 2024 at 09:28:36AM +0000, Bertrand Drouvot wrote: > > +# No "Backpatch" region here as code is generated automatically. > > > > What about "region here as has

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Bertrand Drouvot
lot's LSN(%X/%X) ", +remote_slot->name, + LSN_FORMAT_ARGS(slot->data.confirmed_flush), + LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)); I don't think that the message is corr

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 09:43:35AM +0530, shveta malik wrote: > On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > > > On Thu, Apr 4, 2024 at 2:

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 11:21:43AM +0530, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot > wrote: > Please find the attached v36 patch. Thanks! A few comments: 1 === + +The timeout is measured from the time since the slot h

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 04:09:01PM +0530, shveta malik wrote: > On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot > wrote: > > > > What about something like? > > > > ereport(LOG, > > errmsg("synchronized confirmed_flush_lsn for

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
aybe as a first step we should move the "elog(DEBUG2," message as proposed above to help debugging (that could help to confirm the above theory). If the theory is proven then I'm not sure we need the extra complexity of injection point here, maybe just relying on the slots created via SQL API could be enough. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 02:35:42PM +, Bertrand Drouvot wrote: > I think that maybe as a first step we should move the "elog(DEBUG2," message > as > proposed above to help debugging (that could help to confirm the above > theory). If you agree and think that m

Re: Synchronizing slots from primary to standby

2024-04-06 Thread Bertrand Drouvot
Hi, On Sat, Apr 06, 2024 at 10:13:00AM +0530, Amit Kapila wrote: > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > wrote: > > I think the new LSN can be visible only when the corresponding WAL is > written by XLogWrite(). I don't know what in XLogSetAsyncXactLSN() can &

Re: Synchronizing slots from primary to standby

2024-02-12 Thread Bertrand Drouvot
the primary for example). 009 === Regarding 040_standby_failover_slots_sync.pl what about adding tests for? - synced slot invalidation (and ensure it's recreated once pg_sync_replication_slots() is called and when the slot in primary is valid) - cannot enable failover for a temporary replication slot - replication slots can only be synchronized from a standby server Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-12 Thread Bertrand Drouvot
Hi, On Mon, Feb 12, 2024 at 04:19:33PM +0530, Amit Kapila wrote: > On Mon, Feb 12, 2024 at 3:33 PM Bertrand Drouvot > wrote: > > > > A few random comments: > > > > > > 003 === > > > > + If, after executing the function, > > +

Re: Synchronizing slots from primary to standby

2024-02-13 Thread Bertrand Drouvot
coverage? (I've in mind 731, 739, 766, 778, 786, 796, 808) [1]: https://htmlpreview.github.io/?https://raw.githubusercontent.com/bdrouvot/pg_code_coverage/main/src/backend/replication/logical/slotsync.c.gcov.html Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-13 Thread Bertrand Drouvot
Hi, On Tue, Feb 13, 2024 at 05:20:35PM +0530, Amit Kapila wrote: > On Tue, Feb 13, 2024 at 4:59 PM Bertrand Drouvot > wrote: > > - 84% of the slotsync.c code is covered, the parts that are not are mainly > > related to "errors". > > > > Worth to try to ext

Re: Synchronizing slots from primary to standby

2024-02-14 Thread Bertrand Drouvot
n LOGs on BF and a reproducer > > by Hou-San, so still, there is some chance that this doesn't fix the BF > > failures in > > which case I'll again look into those. > > I have verified that the patch can fix the issue on my machine(after adding > few > more checkpoints before slot invalidation test.) I also added one more check > in > the test to confirm the synced slot is not temp slot. Here is the v2 patch. Thanks! +# To ensure that restart_lsn has moved to a recent WAL position, we need +# to log XLOG_RUNNING_XACTS and make sure the same is processed as well +$primary->psql('postgres', "CHECKPOINT"); Instead of "CHECKPOINT" wouldn't a less heavy "SELECT pg_log_standby_snapshot();" be enough? Not a big deal but maybe we could do the change while modifying 040_standby_failover_slots_sync.pl in the next patch "Add a new slotsync worker". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: 035_standby_logical_decoding unbounded hang

2024-02-14 Thread Bertrand Drouvot
p_sub' > + ]) or die "timed out waiting for logical slot to calculate its > restart_lsn"; > + What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm and then make use of it in create_logical_slot_on_standby() and above? (something like wait_for_restart_lsn_calculation

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 02:49:54PM +0530, Amit Kapila wrote: > On Thu, Feb 15, 2024 at 9:05 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Thursday, February 15, 2024 10:49 AM Amit Kapila > > wrote: > > > > > > On Wed, Feb 14, 2024 at 7:26 PM Bertrand

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 02:09:45PM +0900, Michael Paquier wrote: > On Thu, Jan 18, 2024 at 02:20:28PM +0000, Bertrand Drouvot wrote: > > On Thu, Jan 18, 2024 at 04:59:39PM +0530, Bharath Rupireddy wrote: > >> I'm not sure if it > >> can happen at all, but

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
an output to pg_sync_replication_slots()? The output could be the number of sync slots that have been created and are not considered as sync-ready during the execution. I think that could be a good addition to v2-0001-Add-a-log-if-remote-slot-didn-t-catch-up-to-local.patch proposed here (should trigger

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
onfig(bool restart) +{ worth to add test(s) in 040_standby_failover_slots_sync.pl for it? [1]: https://www.postgresql.org/message-id/CAA4eK1JcBG6TJ3o5iUd4z0BuTbciLV3dK4aKgb7OgrNGoLcfSQ%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 05:58:47PM +0530, Amit Kapila wrote: > On Thu, Feb 15, 2024 at 5:46 PM Bertrand Drouvot > wrote: > > Also I was thinking: what about adding an output to > > pg_sync_replication_slots()? > > The output could be the number of sync slots that h

Re: 035_standby_logical_decoding unbounded hang

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 12:48:16PM -0800, Noah Misch wrote: > On Wed, Feb 14, 2024 at 03:31:16PM +0000, Bertrand Drouvot wrote: > > What about creating a sub, say wait_for_restart_lsn_calculation() in > > Cluster.pm > > and then make use of it in create_logical_slot_o

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
48K ./src/test/recovery/tmp_check/log/040_standby_failover_slots_sync_standby1.log 4.0K ./src/test/recovery/tmp_check/log/040_standby_failover_slots_sync_subscriber1.log 12K ./src/test/recovery/tmp_check/log/regress_log_040_standby_failover_slots_sync Regards, -- Bertrand Drouvot PostgreS

Re: Synchronizing slots from primary to standby

2024-02-16 Thread Bertrand Drouvot
to this catalog_xmin of primary's > slot would precede standby's catalog_xmin and we see this failure. > > As per this theory, we should disable autovacuum on primary to avoid > updates to catalog_xmin values. Makes sense to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-18 Thread Bertrand Drouvot
Hi, On Mon, Feb 19, 2024 at 03:14:07PM +0900, Michael Paquier wrote: > On Thu, Feb 15, 2024 at 11:24:59AM +0000, Bertrand Drouvot wrote: > > Thanks for looking at it! > > > > Agree, using an assertion instead in v3 attached. > > And you did not forget the PG_USED_F

Re: Synchronizing slots from primary to standby

2024-02-18 Thread Bertrand Drouvot
Hi, On Sat, Feb 17, 2024 at 10:10:18AM +0530, Amit Kapila wrote: > On Fri, Feb 16, 2024 at 4:10 PM shveta malik wrote: > > > > On Thu, Feb 15, 2024 at 10:48 PM Bertrand Drouvot > > wrote: > > > > > 5 === > > &

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
Hi, On Fri, Feb 16, 2024 at 08:17:41PM +0100, Magnus Hagander wrote: > On Fri, Jan 19, 2024 at 7:20 AM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Jan 18, 2024 at 04:01:33PM +0100, Magnus Hagander wrote: > > > On Mon, Jan 15, 2024 at 11:17 AM Ber

Avoid switching between system-user and system-username in the doc

2024-02-19 Thread Bertrand Drouvot
/CAOBaU_Yp08MQOK7_k4QVyxL6sf7TURGpjX3tn1Z%2BWxJo2x7%2BGQ%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From 379473a4d7172042107ef587e430f168bc133ad5 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date:

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
id/ZdMWux1HpIebkEmd%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
Hi, On Fri, Feb 16, 2024 at 09:41:41PM +0100, Magnus Hagander wrote: > On Fri, Feb 16, 2024 at 8:55 PM Andres Freund wrote: > > > > Hi, > > > > On 2024-01-12 17:16:53 +0100, Magnus Hagander wrote: > > > On Thu, Jan 11, 2024 at 5:55 PM Bertrand Drouvot > &

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-19 Thread Bertrand Drouvot
conflict_prev != conflict)); > > It took a while for me to understand the above condition, can we > simplify it like below using De Morgan's laws for better readability? > > Assert((conflict_prev == RS_INVAL_NONE) || !terminated || > (conflict_prev == conflict)); I don't have a strong opinon on this, looks like a matter of taste. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-19 Thread Bertrand Drouvot
a predicate loop that tests for a specific exit * condition and otherwise sleeps " but is it needed in our particular case here? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-20 Thread Bertrand Drouvot
Hi, On Tue, Feb 20, 2024 at 08:28:28AM +0900, Michael Paquier wrote: > On Mon, Feb 19, 2024 at 02:28:04PM +0000, Bertrand Drouvot wrote: > > If slock_t protects "only" one counter, then what about using > > pg_atomic_uint64 > > or pg_atomic_uint32 instead? And bt

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-20 Thread Bertrand Drouvot
t that was not necessary). I'll polish and propose my POC test once [1] is pushed (unless you're curious about it before). [1]: https://www.postgresql.org/message-id/flat/ZdLuxBk5hGpol91B%40paquier.xyz Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-21 Thread Bertrand Drouvot
Hi, On Wed, Feb 21, 2024 at 07:08:03AM +0900, Michael Paquier wrote: > On Tue, Feb 20, 2024 at 03:55:08PM +0000, Bertrand Drouvot wrote: > > +PG_FUNCTION_INFO_V1(injection_points_wake); > > +Datum > > +injection_points_wake(PG_FUNCTION_ARGS) > > +{ > > > >

Re: Injection points: some tools to wait and wake

2024-02-21 Thread Bertrand Drouvot
akeup(PG_FUNCTION_ARGS) Empty line missing before "Datum"? 6 === Also maybe some comments are missing above injection_point_init_state(), injection_init_shmem() but it's more a Nit. 7 === While at it, should we add a second injection wait point in t/041_invalid_checkpoint_after_promote.pl to check that they are wake up individually? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-02-21 Thread Bertrand Drouvot
using, maybe we should add a few words about it in the doc? Looking at v5-0001: + + invalidation_reason text + + My initial thought was to put "conflict" value in this new field in case of conflict (not to mention the conflict reason in it). With the current pro

Re: Injection points: some tools to wait and wake

2024-02-22 Thread Bertrand Drouvot
Hi, On Thu, Feb 22, 2024 at 12:02:01PM +0900, Michael Paquier wrote: > On Wed, Feb 21, 2024 at 11:50:21AM +0000, Bertrand Drouvot wrote: > > A few comments: > > > > 1 === > > I think "up" is missing at several places in the patch where "wake" is u

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-02-22 Thread Bertrand Drouvot
Hi, On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote: > On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot > wrote: > > My initial thought was to put "conflict" value in this new field in case of > > conflict (not to mention the conflict rea

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
ot sync worker does +* not interact with user tables, eliminating the risk of executing +* arbitrary code within triggers. Right. I did not check but if we are using operators in our remote SPI calls then it would be worth to ensure they are coming from the pg_catalog schema? Using

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Thu, Feb 22, 2024 at 04:01:34PM +0530, shveta malik wrote: > On Thu, Feb 22, 2024 at 3:44 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > Thanks! > > > > Some random comments about v92_001 (Sorry if it has already been discussed > > up-thr

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 08:35:44AM +0530, shveta malik wrote: > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > wrote: > > > > Suppose that in synchronize_slots() the query would be: > > > > const char *query = "SELECT slot_name, plugin, confirmed_fl

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 09:43:48AM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 8:35 AM shveta malik wrote: > > > > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > > wrote: > > > > > > Suppose that in synchronize_slots() the query woul

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 02:15:11PM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 1:28 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > Because one could create say the "=" OPERATOR in their own schema, attach a > > function to it doing

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
ot is still serving the standby. I think we should handle this case differently, thoughts? 11 === +* Specified physical slot have been invalidated, so no point s/have been/has been/? 12 === +++ b/src/backend/replication/slotfuncs.c @@ -22,6 +22,7 @@ #include

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
is empty The reason is that setting standby_slot_names to a non empty value means that one wants the walsender to wait until the standby catchup. The way to remove this intentional behavior should be by changing the standby_slot_names value (not the existence or the state of the slot(s) it points too). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 09:30:58AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, February 23, 2024 5:07 PM Bertrand Drouvot > wrote: > > On Fri, Feb 23, 2024 at 02:15:11PM +0530, shveta malik wrote: > > > > > > Thanks for the details. I understand it

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 02:18:58AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, February 23, 2024 6:12 PM Bertrand Drouvot > wrote: > > + if (!ok) > > + GUC_check_errdetail("List syntax is invalid."); > > + > > + /* >

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 09:13:05AM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot > wrote: > > > > Hi, > > > I think to set secure search path for remote connection, the standard > > > approach > > > could be to ex

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 10:48:38AM +0530, Amit Kapila wrote: > On Fri, Feb 23, 2024 at 4:45 PM Bertrand Drouvot > wrote: > > > > On Fri, Feb 23, 2024 at 09:46:00AM +, Zhijie Hou (Fujitsu) wrote: > > > > > > Besides, I'd li

Re: Injection points: some tools to wait and wake

2024-02-26 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 12:57:09PM +0900, Michael Paquier wrote: > On Thu, Feb 22, 2024 at 08:00:24AM +0000, Bertrand Drouvot wrote: > > +/* Maximum number of wait usable in injection points at once */ > > > > s/Maximum number of wait/Maximum number of waits/ ? >

Re: Documentation: warn about two_phase when altering a subscription

2024-02-26 Thread Bertrand Drouvot
> I feel this makes the description flow a bit better when reading. But other > than that I think it's quite clear. Thanks! [1]: https://www.postgresql.org/message-id/CAA4eK1KdZMtJfo%3DK%3DXWsQQu8OHutT_dwJV2D3zs4h9g5zdNz2A%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-26 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 05:18:25PM +0530, Amit Kapila wrote: > On Mon, Feb 26, 2024 at 12:59 PM Bertrand Drouvot > wrote: > > > > 10 === > > > > > > > > + i

Re: Synchronizing slots from primary to standby

2024-02-26 Thread Bertrand Drouvot
we need to check atleast syntax before > we return due to NULL ReplicationSlotCtl. We get NULL > ReplicationSlotCtl during instance startup in > check_standby_slot_names() as postmaster first loads GUC-table and > then initializes shared-memory for replication slots. See calls of > Init

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-26 Thread Bertrand Drouvot
Hi, On Tue, Feb 20, 2024 at 04:03:53PM +, Bertrand Drouvot wrote: > Hi, > > On Tue, Feb 20, 2024 at 02:33:44PM +0900, Michael Paquier wrote: > > On Tue, Feb 20, 2024 at 08:51:17AM +0900, Michael Paquier wrote: > > > Prefixing these with "initial_&q

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
r WARNINGS later in the > FilterStandbySlots. Maybe we don't need the double-checking and it is > enough to check in FilterStandbySlots? Good point, I have the feeling that it is enough to check in FilterStandbySlots(). Indeed, if the value is syntactically correct, then I think that it

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
est easier to read / cleaner, I think it may make them more difficult to understand as 'await_injection_point' would probably be too generic. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi, On Tue, Feb 27, 2024 at 04:49:03PM +0500, Andrey M. Borodin wrote: > Hi, > > > On 27 Feb 2024, at 16:08, Bertrand Drouvot > > wrote: > > > > On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote: > >> > >> But, AFAICS, th

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote: > On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote: > > So, I'm ok with the new helper too. > > If both of you feel strongly about that, I'm OK with introducing > something like that.

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote: > On Mon, Feb 26, 2024 at 9:13 AM shveta malik wrote: > > > > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > I think to set sec

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 06:48:37AM +, Zhijie Hou (Fujitsu) wrote: > On Wednesday, February 28, 2024 2:38 PM Bertrand Drouvot > wrote: > > On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote: > > > On Mon, Feb 26, 2024 at 9:13 AM shveta malik > > wr

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
slots. If the worker was not prepared +# to handle such attacks, it would have failed during Worth to mention the underlying check / function that would get an "unexpected" result? Except for the above (nit) comments the patch looks good to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
=== Regarding the test, what about adding one to test the "new" behavior discussed up-thread? (logical replication will wait if slot mentioned in standby_slot_names is dropped and/or does not exist when the engine starts?) Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
ction > > instead of worker to test the operator-redirection using search-patch. > > This has been done to simplify the test case and reduce the added > > time. Thanks! > I have slightly adjusted the comments in the attached, otherwise, LGTM. Same here, LGTM. Regards, -- B

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
on) is used or not. Based on the above, I did prefer the original proposal but I think we can keep what has been pushed (Peter's proposal). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
Hi, On Thu, Feb 29, 2024 at 02:56:23PM +0900, Michael Paquier wrote: > On Wed, Feb 28, 2024 at 06:20:41AM +0000, Bertrand Drouvot wrote: > > On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote: > >> On Tue, Feb 27, 2024 at 01:39:59PM +, Bertrand Drouvot wrote:

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
Hi, On Thu, Feb 29, 2024 at 02:34:35PM +0500, Andrey M. Borodin wrote: > > > > On 29 Feb 2024, at 13:35, Bertrand Drouvot > > wrote: > > > > Something like the attached? > > There's extraneous print "done\n"; doh! bad copy/paste ;-) >

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
> > The log should provide some context about the caller failing, meaning > that the backend type and the injection point name should be mentioned > in these logs to help in debugging issues. Yeah, done in v3 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS O

Re: Synchronizing slots from primary to standby

2024-02-29 Thread Bertrand Drouvot
e for all the ones that rely on pg_logical_slot_get_changes_guts(), means: - pg_logical_slot_get_changes - pg_logical_slot_peek_changes - pg_logical_slot_get_binary_changes - pg_logical_slot_peek_binary_changes Not sure it's worth to mention the "binary" ones though as their doc mention the

Missing LWLock protection in pgstat_reset_replslot()

2024-03-01 Thread Bertrand Drouvot
Hi hackers, I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, we don't have any guarantee that the slot is active (then preventing it to be dropped/recreated) when this function is executed. Attached a patch to add the missing protection. Regards, -- Bertrand Dr

Re: Regardign RecentFlushPtr in WalSndWaitForWal()

2024-03-01 Thread Bertrand Drouvot
else if (!RecoveryInProgress()) > RecentFlushPtr = GetFlushRecPtr(NULL); > else > RecentFlushPtr = GetXLogReplayRecPtr(NULL); > It seems to me that [2] alone could be sufficient. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-03-03 Thread Bertrand Drouvot
Hi, On Mon, Mar 04, 2024 at 10:44:34AM +0900, Michael Paquier wrote: > On Fri, Mar 01, 2024 at 06:52:45AM +0000, Bertrand Drouvot wrote: > > + if (defined($backend_type)) > > + { > > + $backend_type = qq('$backend_type'); > > +

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-04 Thread Bertrand Drouvot
l slots. I'm > not seeing a strong need for another column. Yeah having two columns was more for convenience purpose. Without the "conflict" one, a slot conflicting with recovery would be "a logical slot having a non NULL invalidation_reason". I'm also fine wi

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
elf -- e.g. > > you have to read the accompanying comment or documentation to have any > > idea of its purpose. > > > > Yeah, one has to read the description but that is true for other > parameters like "temp_tablespaces". I don't have any better ideas but > open

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
WaitForWal(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn, +uint32 *wait_event) Same questions as for NeedToWaitForStandby(). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Statistics Import and Export

2024-03-04 Thread Bertrand Drouvot
t; > Leaving the export/import scripts off for the time being, as they haven't > > changed and the next likely change is to fold them into pg_dump. > > > > > > > v6 posted below. Thanks! I had in mind to look at it but it looks like a rebase is needed. Regards,

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
ted = (slot->data.invalidated != RS_INVAL_NONE); inactive = (slot->active_pid == 0); instead? I think it's easier to read and it looks like this is the way it's written in other places (at least the few I checked). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Ope

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-03-05 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 09:42:20AM +0900, Michael Paquier wrote: > On Mon, Feb 26, 2024 at 02:01:45PM +0000, Bertrand Drouvot wrote: > > Though [1] mentioned up-thread is not pushed yet, I'm Sharing the POC patch > > now > > (see the attached). > > I h

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 09:55:32AM +0200, Heikki Linnakangas wrote: > On 01/03/2024 12:15, Bertrand Drouvot wrote: > > Hi hackers, > > > > I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, > > we > > don't have any guarantee th

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread Bertrand Drouvot
anged > > to false. > > > > Also don't we need to release the lock when we return here: > > /* > * Nothing to do for physical slots as we collect stats only for logical > * slots. > */ > if (SlotIsPhysical(slot)) > return; D'oh! Thanks! Fixed in v2 shared

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > wrote: > Thanks. Can we try to get rid of multiple LwLockRelease in > pgstat_reset_replslot(). Is this any better? > > /* > -* Nothing to

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-06 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > wrote: > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart w

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 02:47:55PM +0900, Michael Paquier wrote: > On Tue, Mar 05, 2024 at 10:17:03AM +0000, Bertrand Drouvot wrote: > > On Tue, Mar 05, 2024 at 09:42:20AM +0900, Michael Paquier wrote: > > The issue is that then the new ASSERT is > > triggered leading to

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 02:46:57PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > wrote: > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. Thanks! A few commen

<    1   2   3   4   5   6   7   8   9   >