Hi Ondřej, Thanks for the report. It seems to be a clear violation of what is promised in the docs. Although it's unlikely that someone implemented an application which deals with important data and "pressed Ctr+C" as it's done in psql. So this might be not such a critical issue after all. BTW what version of PostgreSQL are you using?
On Mon, Apr 19, 2021 at 10:13 PM Ondřej Žižka <ondrej.zi...@stratox.cz> wrote: > > Hello all, > I would like to know your opinion on the following behaviour I see for > PostgreSQL setup with synchronous replication. > > This behaviour happens in a special use case. In this use case, there are 2 > synchronous replicas with the following config (truncated): > > - 2 nodes > - synchronous_standby_names='*' > - synchronous_commit=remote_apply > > > With this setup run the following steps (LAN down - LAN between master and > replica): > ----------------- > postgres=# truncate table a; > TRUNCATE TABLE > postgres=# insert into a values (1); -- LAN up, insert has been applied to > replica. > INSERT 0 1 > Vypnu LAN na serveru se standby: > postgres=# insert into a values (2); --LAN down, waiting for a confirmation > from sync replica. In this situation cancel it (press CTRL+C) > ^CCancel request sent > WARNING: canceling wait for synchronous replication due to user request > DETAIL: The transaction has already committed locally, but might not have > been replicated to the standby. > INSERT 0 1 > There will be warning that commit was performed only locally: > 2021-04-12 19:55:53.063 CEST [26104] WARNING: canceling wait for synchronous > replication due to user request > 2021-04-12 19:55:53.063 CEST [26104] DETAIL: The transaction has already > committed locally, but might not have been replicated to the standby. > > postgres=# insert into a values (2); --LAN down, waiting for a confirmation > from sync replica. In this situation cancel it (press CTRL+C) > ^CCancel request sent > WARNING: canceling wait for synchronous replication due to user request > DETAIL: The transaction has already committed locally, but might not have > been replicated to the standby. > INSERT 0 1 > postgres=# insert into a values (2); --LAN down, waiting for sync replica, > second attempt, cancel it as well (CTRL+C) > ^CCancel request sent > WARNING: canceling wait for synchronous replication due to user request > DETAIL: The transaction has already committed locally, but might not have > been replicated to the standby. > INSERT 0 1 > postgres=# update a set n=3 where n=2; --LAN down, waiting for sync replica, > cancel it (CTRL+C) > ^CCancel request sent > WARNING: canceling wait for synchronous replication due to user request > DETAIL: The transaction has already committed locally, but might not have > been replicated to the standby. > UPDATE 2 > postgres=# update a set n=3 where n=2; -- run the same update,because data > from the previous attempt was commited on master, it is sucessfull, but no > changes > UPDATE 0 > postgres=# select * from a; > n > --- > 1 > 3 > 3 > (3 rows) > postgres=# > ------------------------ > > Now, there is only value 1 in the sync replica table (no other values), data > is not in sync. This is expected, after the LAN restore, data will come sync > again, but if the main/primary node will fail and we failover to replica > before the LAN is back up or the storage for this node would be destroyed and > data would not sync to replica before it, we will lose data even if the > client received successful commit (with a warning). > From the synchronous_commit=remote_write level and "higher", I would expect, > that when the remote application (doesn't matter if flush, write or apply) > would not be applied I would not receive a confirmation about the commit > (even with a warning). Something like, if there is no commit from sync > replica, there is no commit on primary and if someone performs the steps > above, the whole transaction will not send a confirmation. > > This can cause issues if the application receives a confirmation about the > success and performs some follow-up steps e.g. create a user account and > sends a request to the mail system to create an account or create a VPN > account. If the scenario above happens, there can exist a VPN account that > does not have any presence in the central database and can be a security > issue. > > I hope I explained it sufficiently. :-) > > Do you think, that would be possible to implement a process that would solve > this use case? > > Thank you > Ondrej -- Best regards, Aleksander Alekseev