I'm still casting about to make sure I have my head around the issues adequately to suggest a documentation update. Here's my current understanding. What follows is intended to help define the nature and scope of the issue, not to be the sort of text that belongs in user documentation.

Assume transactions T0, T1, and TN, where TN can be T1 or a different transaction, and T0 is concurrent with T1 and TN. Assume non-overlapping logical sets of data x and y (either or both of which may be empty). Differences between the behavior required by the standard for serializable transactions and the behavior which occurs under snapshot isolation can manifest when all of the following hold: T0 reads x and modifies y based on what was read from x; T1 modifies x in a way that would affect T0's modification to y; TN reads y; and T1's modifications are visible to TN, either because it is the same transaction or because T1 committed before TN got its snapshot.

I think that transient SELECT anomalies can occur if TN also reads x without modifying anything based on y. My example of an incomplete receipt journal is a case of this (sketched below).

I think that persistent data integrity can be compromised when TN modifies data based on y. The example currently in the documentation is a special case of this, where the modified data is part of x. Another example would be an attempt to maintain referential integrity under snapshot isolation without additional locks, where T1 is TN and one transaction checks for children before deleting the parent while the other checks for a parent before inserting a child (also sketched below). Many other types of integrity enforcement would fail if serializable behavior is assumed, such as ensuring the existence of some minimum number of rows which meet certain criteria (e.g., staffing) or ensuring that the sum of some numeric column for a range of rows stays above a minimum amount (e.g., banking). My initial intuition that simply applying the same locks which guarantee serializable behavior in a non-snapshot database to a snapshot database would guarantee integrity is just plain wrong.

There are a number of academic papers suggesting techniques for handling these issues. Unfortunately, what I've seen so far suggests that there are three main approaches available under PostgreSQL, all with significant cost of one type or another.

(1) A rigorous analysis of all queries allowed against the database, to determine where transaction conflicts can occur, with the expertise needed to resolve each. The downsides are that ad hoc queries can still cause anomalies, and that when new queries are introduced a time-intensive review of all existing queries may be needed, with locking changes required in programs which weren't otherwise being modified.

(2) A less rigorous examination might suffice if rather brutal table-level locks are applied mechanically. Even this requires knowledge of what's happening outside the area where the change is being made. If the receipt example is solved by adding a table lock on the receipt table to the code which updates the control record, the control record update must be modified again if some other code later modifies some new table (say, receipt_voids) based on looking at the control record to decide what to do.

(3) A finer-grained approach would be to make no-effect updates to rows in order to lock them whenever they are read for purposes of updating something else in the same transaction. This could have a high cost in disk access and table bloat.

Rough sketches of the receipt anomaly, the referential integrity failure, and the last two workarounds follow.
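To make the receipt example concrete, here is a minimal sketch. The ctl and receipt tables, the deposit_no column, and the specific values are all made up for illustration; the only thing that matters is the ordering of the snapshots.

  -- T0: record a receipt, stamping it with the current deposit number.
  BEGIN ISOLATION LEVEL SERIALIZABLE;   -- which currently gives snapshot isolation
  SELECT deposit_no FROM ctl WHERE k = 'receipt';              -- reads x, sees 5
  INSERT INTO receipt (deposit_no, amount) VALUES (5, 100.00); -- modifies y

  -- T1: close the batch; starts and commits while T0 is still open.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  UPDATE ctl SET deposit_no = 6 WHERE k = 'receipt';           -- modifies x
  COMMIT;

  -- TN: report the "closed" batch after T1 commits but before T0 commits.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT deposit_no FROM ctl WHERE k = 'receipt';              -- sees 6, so batch 5 looks closed
  SELECT sum(amount) FROM receipt WHERE deposit_no = 5;        -- misses T0's receipt
  COMMIT;

  -- T0 now commits; a receipt quietly joins a batch already reported as complete.
  COMMIT;

Every transaction commits without error, yet the journal TN printed for batch 5 is incomplete.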
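And a sketch of the referential integrity failure, again with made-up parent and child tables, no foreign key constraint, and the integrity check living entirely in the application SQL:

  -- Session A: delete the parent only if it has no children.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM child WHERE parent_id = 1;   -- sees 0
  DELETE FROM parent WHERE id = 1;

  -- Session B, concurrently: insert a child only if the parent exists.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM parent WHERE id = 1;         -- still sees the parent in its snapshot
  INSERT INTO child (parent_id) VALUES (1);
  COMMIT;

  -- Session A commits too; neither transaction wrote a row the other wrote,
  -- so there is no conflict, and the child is left pointing at a deleted parent.
  COMMIT;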
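For (2), this is roughly what the mechanical table lock might look like in the receipt example, using the same made-up schema. EXCLUSIVE is just one plausible choice of mode here; it conflicts with the ROW EXCLUSIVE lock an INSERT takes without blocking plain SELECTs.

  -- Batch close: take the table lock before updating the control record, so
  -- in-flight receipt inserts finish first and new ones wait for the commit.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  LOCK TABLE receipt IN EXCLUSIVE MODE;
  UPDATE ctl SET deposit_no = deposit_no + 1 WHERE k = 'receipt';
  COMMIT;

Whether that alone closes every window still depends on the order in which the inserting code reads the control record and takes its own locks, which is exactly the kind of whole-system knowledge complained about above.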
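And for (3), a sketch of the no-effect update, again against the same made-up schema: the inserting transaction writes the control row it is about to read, so that the batch-close transaction's update of that same row turns the silent write skew into an ordinary write-write conflict.

  BEGIN ISOLATION LEVEL SERIALIZABLE;
  UPDATE ctl SET deposit_no = deposit_no
    WHERE k = 'receipt';                 -- no value change; just locks the row
  INSERT INTO receipt (deposit_no, amount)
    SELECT deposit_no, 100.00 FROM ctl WHERE k = 'receipt';
  COMMIT;

If the batch-close transaction updates the same ctl row concurrently, one of the two fails with "could not serialize access due to concurrent update" and can be retried, rather than both committing with the anomaly intact. The cost is a dead row version in ctl for every receipt inserted.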
The no-effect-update approach does have the advantage of providing a simple technique which, if applied consistently, doesn't require knowledge of software beyond what is under development. Of course, if the anomalies are infrequent enough and of a nature which can be tolerated, the whole issue can be ignored, or simply monitored for manual correction.

Any suggestions for refining the description of where the anomalies can occur, or for how best to work around particular classes of them, are welcome. I'm particularly interested in practical methodologies for ensuring that no holes are left through which business rules can fail to be applied. There are so many papers on the topic that I'm not sure where to go next; I hope someone can point me at something good I've missed so far.

I haven't lost track of the language suggested by Robert Haas, which I think frames the issue nicely. I just want to follow it with a reasonable description of where this is an issue and how to handle it.

-Kevin