On Tue, Sep 4, 2012 at 7:07 AM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> On 03.09.2012 10:43, Fujii Masao wrote:
>>
>> On Sat, Sep 1, 2012 at 2:32 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>
>>> On Fri, Aug 31, 2012 at 5:03 PM, Heikki Linnakangas <hlinn...@iki.fi>
>>> wrote:
>>>>
>>>> Aside from the missing locking, I wonder what that does to a cascaded
>>>> standby. If there is an active walsender running while
>>>> RecoveryTargetTLI is changed, I think what will happen is that the
>>>> walsender will continue to stream WAL from the old timeline, but
>>>> because the startup process is now actually replaying from a different
>>>> timeline, the walsender will send bogus WAL to the standby.
>>>
>>> Good catch! That's really a problem. To address that, we should
>>> terminate all cascading walsenders when the timeline history file is
>>> read and the recovery target timeline is changed?
>>
>> This is not the right fix. After terminating cascading walsenders, it
>> might take them some time to come to an end, and during that time they
>> might send bogus WAL from the old timeline. Currently there is no
>> safeguard against sending bogus WAL from the old timeline. To implement
>> such a safeguard, a cascading walsender needs to know when the timeline
>> is updated and which is the last valid WAL file of the timeline, as the
>> startup process knows. IOW, we need to change cascading walsenders so
>> that they also read and understand the timeline history files. This is
>> not an easy fix at this stage (9.2.0 is about to be released...).
>>
>> So, as one idea, I'm thinking to just forbid cascading replication when
>> recovery_target_timeline is set to 'latest'. Thoughts?
>
> Hmm, I was thinking that when walsender gets the position it can send the
> WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the
> current recovery timeline. If it has changed, refuse to send the new WAL
> and terminate. That would be a fairly small change, it would just close
> the window between requesting walsenders to terminate and them actually
> terminating.
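
To make the idea quoted above concrete, here is a rough, self-contained sketch of reading the flush position and the recovery timeline under one lock. It is not the actual patch and not PostgreSQL code: the struct, the `sketch_*` function names, the simplified types, and the use of a pthread mutex in place of the server's own locking are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the server's own types. */
typedef uint32_t TimeLineID;
typedef uint64_t RecPtr;

/* Hypothetical shared state written by the startup process. */
typedef struct
{
    pthread_mutex_t lock;
    RecPtr      flush_ptr;      /* WAL may be sent up to this position */
    TimeLineID  recovery_tli;   /* timeline currently being replayed */
} SharedRecoveryState;

static SharedRecoveryState shared_state = {PTHREAD_MUTEX_INITIALIZER, 0, 1};

/* Startup-process side: publish position and timeline together. */
void
sketch_set_recovery_state(RecPtr flush_ptr, TimeLineID tli)
{
    pthread_mutex_lock(&shared_state.lock);
    shared_state.flush_ptr = flush_ptr;
    shared_state.recovery_tli = tli;
    pthread_mutex_unlock(&shared_state.lock);
}

/*
 * Walsender side: fetch the flush position and the timeline in one
 * critical section, so the pair is always mutually consistent.
 */
RecPtr
sketch_get_standby_flush_ptr(TimeLineID *tli)
{
    RecPtr      result;

    assert(tli != NULL);
    pthread_mutex_lock(&shared_state.lock);
    result = shared_state.flush_ptr;
    *tli = shared_state.recovery_tli;
    pthread_mutex_unlock(&shared_state.lock);
    return result;
}

/*
 * Returns true and sets *send_upto if the walsender may keep streaming.
 * Returns false, leaving *send_upto untouched, if the recovery timeline
 * no longer matches the one being streamed; the caller should terminate.
 */
bool
sketch_walsender_may_send(TimeLineID streaming_tli, RecPtr *send_upto)
{
    TimeLineID  current_tli;
    RecPtr      flush_ptr = sketch_get_standby_flush_ptr(&current_tli);

    if (current_tli != streaming_tli)
        return false;
    *send_upto = flush_ptr;
    return true;
}
```

Because the timeline is read in the same critical section as the position, a walsender in this sketch can never obtain a new send limit that belongs to a timeline other than the one it is streaming, which is the window the proposal is meant to close.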

Yeah, sounds good. Could you implement the patch? If you don't have
time, I will...

Regards,

--
Fujii Masao