Hi,
On 11/9/23 11:54 AM, shveta malik wrote:
> PFA v32 patches which has below changes:
Thanks!
> 7) Added warning for cases where a user-slot with the same name is
> already present which slot-sync worker is trying to create. Sync for
> such slots is skipped.
I'm seeing an assertion failure and a segfault in this case, due to the
ReplicationSlotRelease() call in synchronize_one_slot().
Adding this extra check prior to it:
-       ReplicationSlotRelease();
+       if (!(found && s->data.sync_state == SYNCSLOT_STATE_NONE))
+               ReplicationSlotRelease();
makes them disappear.
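To illustrate why the check helps, here is a toy model, not the patch code. It assumes that the skip path for a pre-existing user slot reaches the release call without ever having acquired the slot. All toy_* names and the ToySlot type are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of this is PostgreSQL code. */
typedef enum { TOY_SYNC_NONE, TOY_SYNC_INITIATED, TOY_SYNC_READY } ToySyncState;

typedef struct ToySlot
{
    ToySyncState sync_state;
} ToySlot;

/* Models the per-process "slot I currently hold" pointer. */
static ToySlot *toy_my_slot = NULL;

static void
toy_acquire(ToySlot *s)
{
    assert(toy_my_slot == NULL);    /* cannot hold two slots at once */
    toy_my_slot = s;
}

/* Returns false where a real release would trip its assertion. */
static bool
toy_release(void)
{
    if (toy_my_slot == NULL)
        return false;
    toy_my_slot = NULL;
    return true;
}

/*
 * Returns true if the sync attempt ended without an invalid release.
 * "guarded" selects whether the extra check is applied.
 */
static bool
toy_synchronize_one_slot(ToySlot *s, bool found, bool guarded)
{
    bool        is_user_slot = found && s->sync_state == TOY_SYNC_NONE;

    if (!is_user_slot)
        toy_acquire(s);             /* normal path: we hold the slot */

    /* ... the actual synchronization work would go here ... */

    if (guarded && is_user_slot)
        return true;                /* skipped slot: nothing to release */

    return toy_release();
}
```

In this model, a found slot in state TOY_SYNC_NONE fails without the guard and passes with it, while slots that were really acquired behave the same either way.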
> Open Question:
> 1) Currently I have put drop slot logic for slots with 'sync_state=i'
> in slot-sync worker. Do we need to put it somewhere in promotion-logic
> as well?
Yeah I think so, because there is a time window when one could "use" the slot
after the promotion and before it is removed. Producing things like:
"
2023-11-09 15:16:50.294 UTC [2580462] LOG: dropped replication slot
"logical_slot2" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.295 UTC [2580462] LOG: dropped replication slot
"logical_slot3" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] LOG: dropped replication slot
"logical_slot4" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] ERROR: replication slot "logical_slot5"
is active for PID 2594628
"
After the promotion one was able to use logical_slot5, and now we cannot drop
it.
> Perhaps in WaitForWALToBecomeAvailable() where we call
> XLogShutdownWalRcv after checking 'CheckForStandbyTrigger'. Thoughts?
You mean here?
        /*
         * Check to see if promotion is requested. Note that we do
         * this only after failure, so when you promote, we still
         * finish replaying as much as we can from archive and
         * pg_wal before failover.
         */
        if (StandbyMode && CheckForStandbyTrigger())
        {
            XLogShutdownWalRcv();
            return XLREAD_FAIL;
        }
If so, that sounds like a good place to me.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com