On 18 October 2013 19:03, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote:
> The attached patch is a continuation of Robert's work [1]. Reviewing v2... > I made some changes: > - use of Latches instead of pg_usleep, so we don't have to wakeup regularly. OK > - call HandleStartupProcInterrupts() before CheckForStandbyTrigger() because > might change the trigger file's location OK > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and > XLOG_XACT_COMMIT_COMPACT checks Why just those? Why not aborts and restore points also? > - don't care about clockdrift because it's an admin problem. Few minor points on things * The code with comment "Clear any previous recovery delay time" is in wrong place, move down or remove completely. Setting the delay to zero doesn't prevent calling recoveryDelay(), so that logic looks wrong anyway. * The loop exit in recoveryDelay() is inelegant, should break if <= 0 * There's a spelling mistake in sample * The patch has whitespace in one place and one important point... * The delay loop happens AFTER we check for a pause. Which means if the user notices a problem on a commit, then hits pause button on the standby, the pause will have no effect and the next commit will be applied anyway. Maybe just one commit, but its an off by one error that removes the benefit of the patch. So I think we need to test this again after we finish delaying if (xlogctl->recoveryPause) recoveryPausesHere(); We need to explain in the docs that this is intended only for use in a live streaming deployment. It will have little or no meaning in a PITR. I think recovery_time_delay should be called <something>_apply_delay to highlight the point that it is the apply of records that is delayed, not the receipt. And hence the need to document that sync rep is NOT slowed down by setting this value. And to make the name consistent with other parameters, I suggest min_standby_apply_delay We also need to document caveats about the patch, in that it only delays on timestamped WAL records and other records may be applied sooner than the delay in some circumstances, so it is not a way to avoid all cancellations. We also need to document the behaviour of the patch is to apply all data received as quickly as possible once triggered, so the specified delay does not slow down promoting the server to a master. That might also be seen as a negative behaviour, since promoting the master effectively sets recovery_time_delay to zero. I will handle the additional documentation, if you can update the patch with the main review comments. Thanks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers