Fujii-san, thank you for comments.

>The cause of this problem is that the checkpointer's sleep time is calculated
>from both checkpoint_timeout and archive_timeout during normal running,
>but calculated only from checkpoint_timeout during recovery. So Daisuke-san's
>patch tries to change that so that it's calculated from both of them even
>during recovery. No?

Yes, it's exactly so.

>last_xlog_switch_time is not updated during recovery. So "elapsed_secs" can be
>large and cur_timeout can be negative. Isn't this problematic?

Yes... My patch was missing this.
How about using the original archive_timeout value for calculating cur_timeout 
during recovery?

                if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
                {
                        elapsed_secs = now - last_xlog_switch_time;
                        if (elapsed_secs >= XLogArchiveTimeout)
                                continue;               /* no sleep for us ... 
*/
                        cur_timeout = Min(cur_timeout, XLogArchiveTimeout - 
elapsed_secs);
                }
+               else if (XLogArchiveTimeout > 0)
+                       cur_timeout = Min(cur_timeout, XLogArchiveTimeout);

During recovery, accurate cur_timeout is not calculated because elapsed_secs is 
not used.
However, after recovery is complete, WAL archiving will start by the next 
archive_timeout is reached.
I felt it is enough to solve this problem.

>As another approach, what about waking the checkpointer up at the end of
>recovery like we already do for walsenders?

If the above solution is not good, I will consider this approach.

Regards,
Daisuke, Higuchi


Reply via email to