Frank Wittig <[EMAIL PROTECTED]> writes:
> The problem is that the slave server stops checkpointing after some
> hours of working (about 24 to 48 hours of continued log replay).

Hm ... look at RecoveryRestartPoint() in xlog.c.  Could there be
something wrong with this logic?

    /*
     * Do nothing if the elapsed time since the last restartpoint is less than
     * half of checkpoint_timeout.  (We use a value less than
     * checkpoint_timeout so that variations in the timing of checkpoints on
     * the master, or speed of transmission of WAL segments to a slave, won't
     * make the slave skip a restartpoint once it's synced with the master.)
     * Checking true elapsed time keeps us from doing restartpoints too often
     * while rapidly scanning large amounts of WAL.
     */
    elapsed_secs = time(NULL) - ControlFile->time;
    if (elapsed_secs < CheckPointTimeout / 2)
        return;

The idea is that the slave (once in sync with the master) ought to
checkpoint every time it sees a checkpoint record in the master's
output.  I'm not seeing a flaw but maybe there is one here, or
somewhere nearby.  Are you sure the master is checkpointing?

            regards, tom lane
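
For illustration, the elapsed-time test quoted above can be pulled out into a stand-alone function and tried with a few values. This is only a sketch: restartpoint_due() and its parameters are made-up names, not anything in xlog.c, and last_restartpoint merely stands in for ControlFile->time.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical stand-alone version of the elapsed-time test.
 *
 * now                 current time on the slave
 * last_restartpoint   time of the previous restartpoint
 *                     (ControlFile->time in the quoted code)
 * checkpoint_timeout  checkpoint_timeout setting, in seconds
 *
 * Returns true when a restartpoint would be performed, false when the
 * quoted code would return early.
 */
static bool
restartpoint_due(time_t now, time_t last_restartpoint, int checkpoint_timeout)
{
    long elapsed_secs = (long) (now - last_restartpoint);

    /* Same comparison as the quoted code: skip if under half the timeout. */
    return elapsed_secs >= checkpoint_timeout / 2;
}
```

Assuming checkpoint_timeout is 300 seconds, a checkpoint record replayed 100 seconds after the last restartpoint is skipped, while one replayed 150 seconds or more after it leads to a restartpoint.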