>>> Martijn van Oosterhout <klep...@svana.org> wrote: > On Mon, Dec 22, 2008 at 11:00:53AM -0600, Kevin Grittner wrote: >> As I've understood limitations of the PostgreSQL implementation of >> SERIALIZABLE transactions, at least the only example given in the >> documentation, revolve around a rather unlikely situation: >> >> Given concurrent transactions T1 and T2 and non-overlapping sets of >> data A and B, T1 reads data including A and uses the data to modify B >> while T2 reads data including B and uses that data to modify A, where >> the modifications performed by either would affect the modifications >> made by the other, if they were visible. > > In so far as the "modifications" are just INSERTs (no UPDATEs or > DELETEs), yes. This case is covered in the documentation. Let's not understate the scope of the issue. UPDATEs and DELETEs count, too. For example: -- connection 1 cir=> create table mytab (class int not null, value int not null); CREATE TABLE cir=> copy mytab from stdin; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> 1 10 >> 1 20 >> 2 100 >> 2 200 >> \. cir=> start transaction isolation level serializable; START TRANSACTION cir=> update mytab set value = (select sum(value) from mytab where class = 2) where class = 1 and value = 10; UPDATE 1 -- connection 2 cir=> start transaction isolation level serializable; START TRANSACTION cir=> update mytab set value = (select sum(value) from mytab where class = 1) where class = 2 and value = 100; UPDATE 1 cir=> commit transaction; COMMIT -- connection 1 cir=> commit transaction; COMMIT cir=> select * from mytab; class | value -------+------- 1 | 20 2 | 200 1 | 300 2 | 30 (4 rows) >> Imagine, as an example, a system which involves recording receipts, >> each of which must go into a daily deposit. There is a control table >> with one row containing the current deposit date for receipts. >> Somewhere mid-afternoon that date is updated, all subsequent receipts >> fall into the new day, and a report is run listing the receipts for the >> day and giving the deposit total. > > This is a variation of the above and has the same "proper" solution: > predicate locking. However, in this case the records in question are > already present so you can workaround it easily. First do a SELECT FOR > UPDATE on all the records you want to update. This will serialize all > parallel transactions to either before or after you. Then do your > update. My point isn't that there aren't workarounds, it is that people might reasonably assume that SERIALIZABLE transactions provide sufficient concurrency control for this, since the only example we give of a problem is a rather contrived update anomaly. The fact that even in cases where the data settles into good form at commit leave windows where race conditions could cause occasional bad results without extra explicit locking is not obvious. >> This absolutely can't happen in a standard-compliant implementation. >> At a minimum, this window where visible data lacks coherency should be >> noted in the documentation. I don't know if there's any way to fix >> this without killing performance. > > Predicate locking is nasty and we don't try. I'm not sure if anybody > else does. I know for a fact that Sybase ASE does. I've heard from reasonably reliable sources that DB2 does. I know that Microsoft SQL Server did for some time after the split from the Sybase code base, but I'm not sure they've continued that; in fact there was a reference to concurrency issues in wikipedia which implied that they no longer do. The implementation is not pretty -- they do it by locking accessed pages and insertion points, and in some cases entire tables. It is so ugly that at one point in discussing similar issues Tom said that it couldn't really qualify as predicate locking, but in the face of the fact that they have covered all the bases to provide true serializable transactions, and that theory says that only predicate locking can do that, he conceded that it was predicate locking -- but really ugly and in a form he would never support. Anyway, I didn't argue that we should provide truly serializable transactions, just that we should provide a less contrived example of where the PostgreSQL implementation can show anomalies, so that people don't get burned through a false sense of security. -Kevin
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers