Re: SyncRepLock acquired exclusively in default configuration

Asim Praveen Wed, 19 Aug 2020 06:22:01 -0700


> On 12-Aug-2020, at 12:02 PM, Masahiko Sawada 
> <masahiko.saw...@2ndquadrant.com> wrote:
> 
> On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pa...@vmware.com> wrote:
>> 
>> 
>> 
>>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>> 
>>> I think this gets to the root of the issue. If we check the flag
>>> without a lock, we might see a slightly stale value. But, considering
>>> that there's no particular amount of time within which configuration
>>> changes are guaranteed to take effect, maybe that's OK. However, there
>>> is one potential gotcha here: if the walsender declares the standby to
>>> be synchronous, a user can see that, right? So maybe there's this
>>> problem: a user sees that the standby is synchronous and expects a
>>> transaction committing afterward to provoke a wait, but really it
>>> doesn't. Now the user is unhappy, feeling that the system didn't
>>> perform according to expectations.
>> 
>> Yes, pg_stat_replication reports a standby in sync as soon as walsender 
>> updates priority of the standby to something other than 0.
>> 
>> The potential gotcha referred above doesn’t seem too severe.  What is the 
>> likelihood of someone setting synchronous_standby_names GUC with either “*” 
>> or a standby name and then immediately promoting that standby?  If the 
>> standby is promoted before the checkpointer on master gets a chance to 
>> update sync_standbys_defined in shared memory, commits made during this 
>> interval on master may not make it to standby.  Upon promotion, those 
>> commits may be lost.
> 
> I think that if the standby is quite behind the primary and in case of
> the primary crashes, the likelihood of losing commits might get
> higher. The user can see the standby became synchronous standby via
> pg_stat_replication but commit completes without a wait because the
> checkpointer doesn't update sync_standbys_defined yet. If the primary
> crashes before standby catching up and the user does failover, the
> committed transaction will be lost, even though the user expects that
> transaction commit has been replicated to the standby synchronously.
> And this can happen even without the patch, right?
>


It is correct that the issue is orthogonal to the patch upthread and I’ve marked
the commitfest entry as ready-for-committer.

Regarding the issue described above, the amount by which the standby is lagging
behind the primary does not affect the severity.  A standby’s state will be
reported as “sync” to the user only after the standby has caught up (state ==
WALSNDSTATE_STREAMING).  The time it takes for the checkpointer to update the
sync_standbys_defined flag in shared memory is the important factor.  Once
checkpointer sets this flag, commits start waiting for the standby (as long as
it is in-sync).

Asim

Re: SyncRepLock acquired exclusively in default configuration

Reply via email to