On 2020/03/25 19:42, Sergei Kornilov wrote:
Hi
Could we add a few words in func.sgml to clarify the behavior? Especially for
users from my example above. Something like:
If a promotion is triggered while recovery is paused, the paused state ends,
replay of any WAL immediately available in the archive or in pg_wal will be
continued and then a promotion will be completed.
This description is true if the pause is requested by pg_wal_replay_pause(),
but not if the recovery target is reached and the pause is requested by
recovery_target_action=pause. In the latter case, even if there is WAL data
available in pg_wal or the archive, it is not replayed, i.e., the promotion
completes immediately. Probably we should document those two cases
explicitly to avoid confusion about promotion and recovery pause?
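To make the two cases concrete, here is a toy model in Python. It only
illustrates the behavior described above and is not PostgreSQL code; the names
PauseReason and wal_replayed_before_promotion are made up for this sketch.

```python
from enum import Enum


class PauseReason(Enum):
    # Pause requested by calling pg_wal_replay_pause().
    REPLAY_PAUSE_FUNCTION = "pg_wal_replay_pause()"
    # Pause entered because the recovery target was reached
    # with recovery_target_action=pause.
    RECOVERY_TARGET_ACTION = "recovery_target_action=pause"


def wal_replayed_before_promotion(reason, available_wal):
    """Return the WAL records replayed between the promotion request
    and the end of recovery, per the two cases described above.

    available_wal stands for records already present in pg_wal or
    the archive (hypothetical list of record labels).
    """
    if reason is PauseReason.REPLAY_PAUSE_FUNCTION:
        # The paused state ends and the available WAL is replayed
        # before the promotion completes.
        return list(available_wal)
    # Recovery target reached: the available WAL is not replayed,
    # so the promotion completes immediately.
    return []
```

With two records available, the first case replays both of them and the
second case replays none.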
This is the description for pg_wal_replay_pause, but actually we suggest calling
pg_wal_replay_resume in the recovery_target_action=pause case... So, I agree, we
need to document both cases.
PS: I think we have inconsistent behavior here... Reading WAL during promotion
from local pg_wal AND calling restore_command, but ignoring walreceiver, also
seems strange with my DBA hat on...
If we don't ignore walreceiver and do try to connect to the master,
a promotion and recovery might never end, since new WAL data can
keep being streamed. Do you think this behavior is more consistent?
IMO it's valid to replay all the WAL data available to avoid data loss
before a promotion completes.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters