Re: Small fixes needed by high-availability tools

Amit Kapila Mon, 12 May 2025 02:41:11 -0700

On Fri, May 2, 2025 at 6:30 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
>
>
> I want to revive attempts to fix some old edge cases of physical quorum 
> replication.
>
> Please find attached draft patches that demonstrate the ideas. These patches 
> are not actually proposed code changes; I would rather reach a design 
> consensus first.
>
> 1. Allow checking standby sync before making data visible after crash recovery
>
> Problem: a Postgres instance must not allow data to be read if it is not yet 
> known to be replicated.
> Immediately after a crash we do not know whether we are still the cluster 
> primary. We can disallow new connections until a standby quorum is 
> established. Of course, walsenders and superusers must be exempt from this 
> restriction.
>
> The key change is the following:
> @@ -1214,6 +1215,16 @@ InitPostgres(const char *in_dbname, Oid dboid,
>         if (PostAuthDelay > 0)
>                 pg_usleep(PostAuthDelay * 1000000L);
>
> +       /* Check if we need to wait for startup synchronous replication */
> +       if (!am_walsender &&
> +               !superuser() &&
> +               !StartupSyncRepEstablished())
> +       {
> +               ereport(FATAL,
> +                               (errcode(ERRCODE_CANNOT_CONNECT_NOW),
> +                                errmsg("cannot connect until synchronous replication is established with standbys according to startup_synchronous_standby_level")));
> +       }
>
> We might also want to cache the fact that the quorum has already been 
> established. Also, the place where the check is done might not be the most 
> appropriate one.
>
> 2. Do not allow cancellation of a locally written transaction
>
> The problem was discussed many times [0,1,2,3] with some agreement on the 
> approach taken. But there were concerns that the solution is incomplete 
> without the first patch in the current thread.
>
> Problem: a user might try to cancel a locally committed transaction, and if 
> we allow that we will show non-replicated data as committed. This leads to 
> losing data with UPSERTs.
>
> The key change is how we process cancels in SyncRepWaitForLSN().
>


One idea to solve this problem could be that whenever we cancel
sync_rep_wait, we set some system-wide flag that indicates that any
new transaction must ensure that all the current data is replicated to
the synchronous standby. Once we have waited for the pending
transactions to replicate, we can reset that system-wide flag.
Now, if the system restarts for any reason during such a wait,
we can use your idea to disallow new connections until the standby
quorum is established.
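
To make the idea a bit more concrete, here is a rough, standalone sketch
of the flag logic. It is not taken from the attached patches and is not
PostgreSQL code: the SyncRepGate struct, the gate_* function names, and
the plain uint64_t LSN type are all hypothetical, and a real
implementation would keep this state in shared memory under a lock.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for XLogRecPtr */

/* Hypothetical system-wide state; a real server would keep it in shared memory. */
typedef struct SyncRepGate
{
    bool catchup_required;      /* set when a sync-rep wait was cancelled */
    Lsn  catchup_lsn;           /* highest cancelled commit LSN not yet confirmed */
    Lsn  standby_flush_lsn;     /* highest LSN confirmed by the standby quorum */
} SyncRepGate;

/* A backend cancelled its wait for commit_lsn: raise the flag if unreplicated. */
static void
gate_on_wait_cancelled(SyncRepGate *g, Lsn commit_lsn)
{
    if (commit_lsn > g->standby_flush_lsn)
    {
        g->catchup_required = true;
        if (commit_lsn > g->catchup_lsn)
            g->catchup_lsn = commit_lsn;
    }
}

/* The standby quorum confirmed flush_lsn: clear the flag once caught up. */
static void
gate_on_standby_flush(SyncRepGate *g, Lsn flush_lsn)
{
    if (flush_lsn > g->standby_flush_lsn)
        g->standby_flush_lsn = flush_lsn;
    if (g->catchup_required && g->standby_flush_lsn >= g->catchup_lsn)
        g->catchup_required = false;
}

/*
 * After a restart the in-memory flag is lost, so assume the worst: require
 * the quorum to confirm everything up to the end of recovery.
 */
static void
gate_init_after_restart(SyncRepGate *g, Lsn end_of_recovery_lsn)
{
    g->catchup_required = true;
    g->catchup_lsn = end_of_recovery_lsn;
    g->standby_flush_lsn = 0;
}

/* New transactions (or connections) may proceed only while the flag is clear. */
static bool
gate_may_start_transaction(const SyncRepGate *g)
{
    return !g->catchup_required;
}
```

The restart case is the part that relies on your first patch: since the
flag cannot survive a crash, the sketch simply starts with it set and
waits for the quorum to confirm the end-of-recovery position.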

-- 
With Regards,
Amit Kapila.
