Greg Stark wrote: > On Mon, Mar 1, 2010 at 5:50 PM, Josh Berkus <j...@agliodbs.com> wrote: > > I don't think that defer_cleanup_age is a long-term solution. ?But we > > need *a* solution which does not involve delaying 9.0. > > So I think the primary solution currently is to raise max_standby_age. > > However there is a concern with max_standby_age. If you set it to, > say, 300s. Then run a 300s query on the slave which causes the slave > to fall 299s behind. Now you start a new query on the slave -- it gets > a snapshot based on the point in time that the slave is currently at. > If it hits a conflict it will only have 1s to finish before the > conflict causes the query to be cancelled. > > In short in the current setup I think there is no safe value of > max_standby_age which will prevent query cancellations short of -1. If > the slave has a constant stream of queries and always has at least one > concurrent query running then it's possible that the slave will run > continuously max_standby_age-epsilon behind the master and cancel > queries left and right, regardless of how large max_standby_age is.
This is sobering. I have added the attached documentation so at least this odd behavior is documented. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
Index: doc/src/sgml/config.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.257 diff -c -c -r1.257 config.sgml *** doc/src/sgml/config.sgml 2 Mar 2010 21:18:59 -0000 1.257 --- doc/src/sgml/config.sgml 2 Mar 2010 23:34:38 -0000 *************** *** 1862,1879 **** <listitem> <para> When server acts as a standby, this parameter specifies a wait policy ! for queries that conflict with data changes being replayed by recovery. If a conflict should occur the server will delay up to this number ! of seconds before it begins trying to resolve things less amicably, as ! described in <xref linkend="hot-standby-conflict">. Typically, ! this parameter makes sense only during replication, so when ! performing an archive recovery to recover from data loss a very high ! parameter setting or -1 which means wait forever is recommended. ! The default is 30 seconds. Increasing this parameter can delay ! master server changes from appearing on the standby. This parameter can only be set in the <filename>postgresql.conf</> file or on the server command line. </para> </listitem> </varlistentry> --- 1862,1892 ---- <listitem> <para> When server acts as a standby, this parameter specifies a wait policy ! for applying WAL entries that conflict with active queries. If a conflict should occur the server will delay up to this number ! of seconds before it cancels conflicting queries, as ! described in <xref linkend="hot-standby-conflict">. ! Typically, this parameter is used only during replication. ! The default is 30 seconds. This parameter can only be set in the <filename>postgresql.conf</> file or on the server command line. </para> + <para> + A high value makes query cancel less likely, and -1 + causes the standby to wait forever for a conflicting query to + complete. Increasing this parameter might delay master server + changes from appearing on the standby. + </para> + <para> + While it is tempting to believe that <varname>max_standby_delay</> + is the maximum number of seconds a query can run before + cancellation is possible, this is not true. When a long-running + query ends, there is a finite time required to apply backlogged + WAL logs. If a second long-running query appears before the + WAL has caught up, the snapshot taken by the second query will + allow significantly less than <varname>max_standby_delay</> + before query cancellation is possible. + </para> </listitem> </varlistentry>
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers