On Mon, Sep 20, 2021 at 4:10 PM Fabrice Chapuis <fabrice636...@gmail.com> wrote:
>
> Hi Amit,
>
> We can reproduce the problem: we load a table of several GB into the schema
> of the publisher, and this triggers the worker's timeout one minute after the
> end of the load. The table being loaded is not replicated.
>
> 2021-09-16 12:06:50 CEST [24881]: [1-1] 
> user=postgres,db=db012a00,client=[local] LOG:  duration: 1281408.171 ms  
> statement: COPY db.table (col1, col2) FROM stdin;
>
> 2021-09-16 12:07:11 CEST [12161]: [1-1] user=,db=,client= LOG:  automatic 
> analyze of table "db.table " system usage: CPU: user: 4.13 s, system: 0.55 s, 
> elapsed: 9.58 s
>
> 2021-09-16 12:07:50 CEST [3770]: [2-1] user=,db=,client= ERROR:  terminating 
> logical replication worker due to timeout
>
> Before increasing the values of wal_sender_timeout and wal_receiver_timeout, I
> would like to investigate further the mechanisms leading to this timeout.
>

The basic problem here seems to be that the WAL sender is not able to
send a keepalive or any other message within the configured
wal_receiver_timeout. I am not sure how that can happen, but can you
try once with autovacuum = off? I want to make sure that the WAL
sender is not being blocked by the autovacuum background process.
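
For instance, something like the following on the publisher should be
enough for the test. This is only a sketch: I am assuming you can connect
as a superuser there and that a configuration reload is acceptable.
autovacuum needs only a reload, not a restart.

```sql
-- Assumption: run as a superuser on the publisher, where the COPY is executed.

-- Note the current setting before changing anything.
SHOW autovacuum;
SHOW wal_sender_timeout;

-- Disable autovacuum for the duration of the test and reload the configuration.
ALTER SYSTEM SET autovacuum = off;
SELECT pg_reload_conf();

-- Confirm the new value, then repeat the large COPY and watch the subscriber log.
SHOW autovacuum;

-- Once the test is done, restore the previous behaviour.
ALTER SYSTEM RESET autovacuum;
SELECT pg_reload_conf();
```

wal_receiver_timeout is a subscriber-side setting, so it would have to be
checked with SHOW on the subscriber rather than on the publisher.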

-- 
With Regards,
Amit Kapila.

