marcin mank wrote:
Perhaps we should listen to the people that have said they don't want
queries cancelled, even if the alternative is inconsistent answers.

I don't like that much. PostgreSQL has traditionally tried very hard to avoid that. It's hard to tell what kind of inconsistencies you'd get, as it would depend on what plan is created, when a vacuum happens to run on the master, etc.

I think an alternative to that would be "if the wal backlog is too
big, let current queries finish and let incoming queries wait till the
backlog gets smaller".

Yeah, that makes sense too.

Many approaches have been proposed, and they all have different tradeoffs and therefore fit different use cases. I'm not sure which ones are or will be included in the patch. We don't need all of them in 8.4; one or two of the simplest ones will do just fine, and we can extend later.

Let me summarize. Whenever a WAL record conflicts with a query-in-progress, we can:

1. kill the query,
2. wait for the query to finish, or
3. let the query proceed, producing invalid results.

There are some combinations of those as well. Your proposal is a variation of 2, to avoid the problem of WAL application falling behind indefinitely. There's also the max_standby_delay option in the patch, to wait a while, and then kill the query.
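
To make the options concrete, here is a rough sketch of how the three choices and the max_standby_delay combination could be expressed as a decision function. This is only an illustration, not code from the patch: the policy strings, the Action enum, resolve_conflict() and admit_new_query() are all made-up names, and only max_standby_delay corresponds to an option mentioned above.

```python
from enum import Enum


class Action(Enum):
    KILL_QUERY = "kill the query"
    WAIT = "wait for the query to finish"
    PROCEED = "let the query proceed"


def resolve_conflict(policy, waited_s=0.0, max_standby_delay_s=None):
    """Decide what WAL replay does about one conflicting query.

    policy: "kill", "wait" or "proceed" (hypothetical names for options 1-3).
    waited_s: how long replay has already waited on this conflict.
    max_standby_delay_s: if set, "wait" turns into "kill" once replay has
        waited at least this long.
    """
    if policy == "kill":
        return Action.KILL_QUERY
    if policy == "proceed":
        return Action.PROCEED
    if policy == "wait":
        if max_standby_delay_s is not None and waited_s >= max_standby_delay_s:
            return Action.KILL_QUERY
        return Action.WAIT
    raise ValueError(f"unknown policy: {policy!r}")


def admit_new_query(wal_backlog_bytes, max_backlog_bytes):
    """Variation of option 2: running queries finish, but incoming queries
    are held back while the unapplied WAL backlog is above a threshold."""
    return wal_backlog_bytes <= max_backlog_bytes
```

With these definitions, resolve_conflict("wait", waited_s=5, max_standby_delay_s=30) keeps waiting, while the same call with waited_s=30 kills the query.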

There are some additional optimizations that can make those options less painful. For example, instead of killing all queries that might be affected by a vacuum record, kill them only when they actually hit a block that was vacuumed (Simon's idea of a latestRemovedLSN field in the page header).
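
One possible reading of the per-block check, sketched under heavy assumptions: replay remembers on each query the earliest cleanup record that conflicted with it, each page remembers the LSN of the last record that removed tuples from it, and the query is cancelled only when it reads a page cleaned at or after its conflict point. The Page and Query classes, the conflict_lsn field and both functions are hypothetical; only the latestRemovedLSN name comes from the idea above, and LSNs are simplified to plain integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    # LSN of the last WAL record that removed tuples from this page,
    # or None if nothing has been removed (stand-in for latestRemovedLSN).
    latest_removed_lsn: Optional[int] = None


@dataclass
class Query:
    # LSN of the earliest cleanup record that conflicted with this query's
    # snapshot, or None if no conflict has been seen (hypothetical field).
    conflict_lsn: Optional[int] = None
    cancelled: bool = False


def note_cleanup_conflict(query, record_lsn):
    """Called by replay instead of killing the query right away."""
    if query.conflict_lsn is None or record_lsn < query.conflict_lsn:
        query.conflict_lsn = record_lsn


def read_page(query, page):
    """Return True if the query may use the page, False if it was cancelled."""
    if (
        query.conflict_lsn is not None
        and page.latest_removed_lsn is not None
        and page.latest_removed_lsn >= query.conflict_lsn
    ):
        query.cancelled = True
        return False
    return True
```

A query that never touches a vacuumed block runs to completion under this scheme, even though a conflicting vacuum record was replayed while it was running.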

Another line of attack is to avoid getting into the situation in the first place, by affecting behavior on the master. If the standby has an online connection to the master (per the synch rep patch), it can tell the master what the slave's OldestXmin is, and the master can take that into account and not remove tuples still needed by the slave. That's not good from a high availability point of view, because you don't want a hung query in the slave to cause a long-running-transaction situation in the master, but for other use cases it would be fine. Or we can just add a constant number of transactions to OldestXmin in the master, to get some breathing room in the server.
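
Roughly, the master-side calculation could look like the sketch below. It assumes that the constant is meant as slack that holds the removal horizon back, treats transaction IDs as plain integers and ignores XID wraparound entirely. removal_horizon() and its parameter names are invented for illustration.

```python
def removal_horizon(master_oldest_xmin, standby_oldest_xmin=None, slack_xids=0):
    """Oldest XID whose tuples vacuum on the master must still keep.

    master_oldest_xmin: OldestXmin computed from the master's own backends.
    standby_oldest_xmin: OldestXmin reported by the standby, if connected.
    slack_xids: constant number of transactions of extra breathing room.
    Assumption: XIDs are plain integers with no wraparound.
    """
    horizon = master_oldest_xmin
    if standby_oldest_xmin is not None:
        # A long-running standby query holds the master's horizon back too.
        horizon = min(horizon, standby_oldest_xmin)
    return max(horizon - slack_xids, 0)
```

The min() is exactly where the high availability concern shows up: a hung query on the slave keeps standby_oldest_xmin low, and the master then cannot remove dead tuples newer than that.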

The bottom line is that we have enough options to make everyone happy. Some understanding of the issue is required to tune it properly, however, so documentation is important.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
