I noticed an odd buildfarm failure today:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2019-03-16%2012%3A12%3A20

of which the key bit seems to be

2019-03-16 15:20:43.835 UTC [10879304] 003_promote.pl LOG:  received replication command: BASE_BACKUP LABEL 'pg_basebackup base backup'    NOWAIT
2019-03-16 15:20:45.857 UTC [10879304] 003_promote.pl ERROR:  could not request checkpoint because checkpointer not running
2019-03-16 15:20:47.227 UTC [61604144] LOG:  received immediate shutdown request

Digging in the buildfarm archives finds seven other occurrences of the
same error in the past three months (I didn't look back further).

The cause of this error is that RequestCheckpoint will give up and fail
after just 2 seconds, which evidently is not long enough on slow or
heavily loaded machines.  Since there isn't any good reason why the
checkpointer wouldn't be running, I'm inclined to swing a large hammer
and kick this timeout up to 60 seconds.  Thoughts?
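
For illustration, the bounded wait described above can be sketched roughly as follows. This is a standalone simplification, not the actual server code: FakeCheckpointer, checkpointer_running and wait_for_checkpointer are hypothetical names, and the 100 ms poll interval is an assumption chosen so that 20 tries is about 2 seconds and 600 tries is about 60 seconds.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for "is the checkpointer up yet?" state. */
typedef struct
{
    int polls_until_ready;      /* polls that still report "not running" */
} FakeCheckpointer;

/* Simulated check; a real server would inspect shared state instead. */
static bool
checkpointer_running(FakeCheckpointer *cp)
{
    if (cp->polls_until_ready > 0)
    {
        cp->polls_until_ready--;
        return false;
    }
    return true;
}

/*
 * Poll until the checkpointer is seen running, giving up after max_tries
 * retries.  sleep_us is the delay between polls in microseconds (0 means
 * no delay, which is handy for testing).  Returns true on success.
 */
static bool
wait_for_checkpointer(FakeCheckpointer *cp, int max_tries,
                      unsigned int sleep_us)
{
    for (int ntries = 0;; ntries++)
    {
        if (checkpointer_running(cp))
            return true;
        if (ntries >= max_tries)
            return false;       /* caller reports "checkpointer not running" */
        if (sleep_us > 0)
        {
            struct timespec ts;

            ts.tv_sec = sleep_us / 1000000;
            ts.tv_nsec = (long) (sleep_us % 1000000) * 1000L;
            nanosleep(&ts, NULL);
        }
    }
}
```

Under these assumptions, a checkpointer that needs 25 polls to come up makes wait_for_checkpointer(&cp, 20, 100000) give up, while the same call with 600 tries succeeds; a checkpointer that is already running returns immediately either way, so the larger limit costs nothing in the normal case.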

                        regards, tom lane
