On Mon, Mar 1, 2010 at 5:50 PM, Josh Berkus <j...@agliodbs.com> wrote:
> I don't think that defer_cleanup_age is a long-term solution.  But we
> need *a* solution which does not involve delaying 9.0.

So I think the primary solution currently is to raise max_standby_age.

However, there is a concern with max_standby_age. Suppose you set it to, say, 300s, then run a 300s query on the slave which causes the slave to fall 299s behind. Now you start a new query on the slave; it gets a snapshot based on the point in time that the slave is currently at. If it hits a conflict, it will have only 1s to finish before the conflict causes the query to be cancelled.

In short, in the current setup I think there is no safe value of max_standby_age, short of -1, which will prevent query cancellations. If the slave has a constant stream of queries and always has at least one query running, then it's possible that the slave will run continuously max_standby_age-epsilon behind the master and cancel queries left and right, regardless of how large max_standby_age is.

To resolve this I think you would have to introduce some chance for the slave to catch up. Something like refusing to use a snapshot older than max_standby_age/2 and instead waiting until the existing queries finish and the slave gets a chance to catch up and see a more recent snapshot. The problem is that this would result in very unpredictable and variable response times from the slave. A single long-lived query could cause replay to pause for a big chunk of max_standby_age and prevent any new query from starting.

Does anyone see any way to guarantee that the slave gets a chance to replay and that new snapshots become visible, without freezing out new queries for extended periods of time?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
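
For anyone who wants to play with the numbers, here is a toy sketch of the argument above. It is not PostgreSQL code; the names grace_for_new_query, simulate, and replay_speed are made up for illustration. It assumes the worst case, where the next WAL record always conflicts with every running query, so replay is blocked whenever anything is running, and it assumes the slave replays at twice the rate the master generates WAL once it is unblocked.

```python
from dataclasses import dataclass


def grace_for_new_query(max_standby_age: float, current_lag: float) -> float:
    """Seconds a query starting now can block replay before it is cancelled."""
    if max_standby_age < 0:          # -1: replay waits forever, never cancels
        return float("inf")
    return max(0.0, max_standby_age - current_lag)


@dataclass
class Result:
    completed: int = 0
    cancelled: int = 0
    max_lag: int = 0


def simulate(max_standby_age: int, query_len: int, start_every: int,
             seconds: int, replay_speed: int = 2) -> Result:
    """Toy 1-second-tick model of a slave under a steady query stream.

    Worst-case assumption: the next WAL record always conflicts with every
    running query, so replay is blocked whenever any query is running.
    """
    res = Result()
    lag = 0                 # seconds the slave is behind the master
    running = []            # remaining run time of each active query
    for t in range(seconds):
        if t % start_every == 0:
            running.append(query_len)
        if running and lag >= max_standby_age:
            # Limit reached: every conflicting query is cancelled.
            res.cancelled += len(running)
            running = []
        if running:
            lag += 1        # replay waits while the master keeps advancing
        else:
            # Replay runs: master adds 1s of WAL, slave applies replay_speed.
            lag = max(0, lag + 1 - replay_speed)
        res.max_lag = max(res.max_lag, lag)
        # Age the surviving queries and retire the ones that finished.
        running = [r - 1 for r in running]
        res.completed += sum(1 for r in running if r == 0)
        running = [r for r in running if r > 0]
    return res


if __name__ == "__main__":
    print(grace_for_new_query(300, 299))
    # Overlapping 60s queries, one started every 30s: replay never gets a gap.
    print(simulate(max_standby_age=300, query_len=60, start_every=30, seconds=3600))
    # Short queries with idle gaps between them: the slave can catch up.
    print(simulate(max_standby_age=300, query_len=10, start_every=30, seconds=3600))
```

In this model, raising max_standby_age only postpones the first cancellation when the queries overlap. Once the lag reaches the limit, it stays near the limit and queries keep being cancelled. When there are idle gaps between queries, the slave catches up and nothing is cancelled, which is the "chance to catch up" the proposal above is trying to force.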