Re: terminating walsender process due to replication timeout

Rene Romero Benavides Tue, 14 May 2019 10:05:35 -0700

To detect network issues quickly, maybe you could monitor replication delay
instead of relying on a very low wal_sender_timeout.
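As a rough sketch of what such monitoring could look like (not something tested against your setup): on the primary, pg_stat_replication exposes replay_lsn and, since PostgreSQL 10, replay_lag, and pg_current_wal_lsn() gives the current WAL position. The helper names, the query constant, and the 16 MiB threshold below are my own assumptions for illustration. The Python part only does the LSN arithmetic; running the query is left to whatever client you already use.

```python
# Hypothetical monitoring sketch: the names and the threshold are assumptions.

# Query to run on the primary (PostgreSQL 10 or later).
LAG_QUERY = """
SELECT application_name,
       state,
       pg_current_wal_lsn() AS primary_lsn,
       replay_lsn,
       replay_lag
FROM pg_stat_replication;
"""


def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replay_lsn: str) -> int:
    """Number of WAL bytes the standby has not replayed yet."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replay_lsn)


def is_lagging(primary_lsn: str, replay_lsn: str,
               threshold_bytes: int = 16 * 1024 * 1024) -> bool:
    """True when the standby is more than threshold_bytes behind."""
    return lag_bytes(primary_lsn, replay_lsn) > threshold_bytes
```

An alert on is_lagging(), or on a standby disappearing from pg_stat_replication altogether, would flag a stalled link without terminating the walsender during heavy load.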

On Mon, May 13, 2019 at 6:42 AM <ayaho...@ibagroup.eu> wrote:


> Hello PostgreSQL Community!
>
> I ran into an issue on my Linux machine running PostgreSQL 11.3.
> I have a two-node database cluster: a master and a standby.
> I ran a large number of long-running queries, which led to the
> databases going out of sync:
> terminating walsender process due to replication timeout
>
> Here is the output in debug mode:
> 2019-05-13 13:21:33 FET 00000 DEBUG:  sending replication keepalive
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 00000 LOG:  terminating walsender process due to
> replication timeout
>
>
> The issue is reproducible. I configure a two-node cluster, download
> demo_small.zip from https://edu.postgrespro.ru/ and run the following
> command:
> psql -U user1 -f demo_small.sql db1
> and I get the observed behaviour.
>
>
> I know that I can increase the wal_sender_timeout value to avoid this
> behaviour (wal_sender_timeout is currently set to 1 second).
> To be honest, I don't want to increase wal_sender_timeout because I would
> like to detect network issues quickly.
>
> After searching, I found that someone reported a similar issue:
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832...@2ndquadrant.com
> which was fixed in PostgreSQL 9.4.16.
>
>
> Is my issue the same as the one described here?
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832...@2ndquadrant.com
> Is there any other way to avoid it without increasing
> wal_sender_timeout?
>
>
> Thank you in advance.
> Regards,
> Andrei



-- 
Genius is 1% inspiration and 99% perspiration.
Thomas Alva Edison
http://pglearn.blogspot.mx/
