On Thu, Jan 27, 2011 at 09:18:23AM -0800, Jeff Davis wrote: > On Tue, 2011-01-25 at 05:57 -0500, Dan Ports wrote: > > This summary is right on. I would add one additional detail or > > clarification to the last point, which is that rather than checking for > > a cycle, we're checking for a transaction with both "in" and "out" > > conflicts, which every cycle must contain. > > To clarify, this means that it will get some false positives, right?
Yes, this is correct. > Is there a reason we can't check for a real cycle, which would let T1 > succeed? I talked today with someone who experimented with doing exactly that in MySQL as part of a research project (Perfectly Serializable Snapshot Isolation, paper forthcoming in ICDE) It confirmed my intuition that this is possible but not as straightforward as it sounds. Some complications I thought of in adapting that to what we're doing: 1) it requires tracking all edges in the serialization graph; besides the rw-conflicts we track there are also the more mundane rw-dependencies (if T2 sees an update made by T1, then T2 has to come after T1) and ww-dependencies (if T2's write modifies a tuple created by T1, then T2 has to come after T1). We are taking advantage of the fact that any cycle must have two adjacent rw-conflict edges, but the rest of the cycle can be composed of other types. It would certainly be possible to track the others, but it would add a bit more complexity and use more memory. 2) it requires doing a depth-first search to check for a cycle, which is more expensive than just detecting two edges. That seems OK if you only want to check it on commit, but maybe not if you want to detect serialization failures at the time of the conflicting read/write (as we do). 3) it doesn't seem to interact very well with our memory mitigation efforts, wherein we discard lots of data about committed transactions that we know we won't need. If we were checking for cycles, we'd need to keep more of it. I suspect summarization (when we combine two transactions' predicate locks to save memory) would also cause problems. None of these seem particularly insurmountable, but they suggest it isn't a clear win to try to find an entire cycle. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers