On Thu, Jan 27, 2011 at 09:18:23AM -0800, Jeff Davis wrote:
> On Tue, 2011-01-25 at 05:57 -0500, Dan Ports wrote:
> > This summary is right on. I would add one additional detail or
> > clarification to the last point, which is that rather than checking for
> > a cycle, we're checking for a transaction with both "in" and "out"
> > conflicts, which every cycle must contain.
> 
> To clarify, this means that it will get some false positives, right?

Yes, this is correct.

> Is there a reason we can't check for a real cycle, which would let T1
> succeed?

I talked today with someone who experimented with doing exactly that in
MySQL as part of a research project (Perfectly Serializable Snapshot
Isolation, paper forthcoming in ICDE)

It confirmed my intuition that this is possible but not as
straightforward as it sounds. Some complications I thought of in
adapting that to what we're doing:

1) it requires tracking all edges in the serialization graph; besides
   the rw-conflicts we track there are also the more mundane
   rw-dependencies (if T2 sees an update made by T1, then T2 has to
   come after T1) and ww-dependencies (if T2's write modifies a tuple
   created by T1, then T2 has to come after T1).
   
   We are taking advantage of the fact that any cycle must have two
   adjacent rw-conflict edges, but the rest of the cycle can be
   composed of other types. It would certainly be possible to track the
   others, but it would add a bit more complexity and use more memory.

2) it requires doing a depth-first search to check for a cycle, which
   is more expensive than just detecting two edges. That seems OK if
   you only want to check it on commit, but maybe not if you want to
   detect serialization failures at the time of the conflicting
   read/write (as we do).

3) it doesn't seem to interact very well with our memory mitigation
   efforts, wherein we discard lots of data about committed
   transactions that we know we won't need. If we were checking for
   cycles, we'd need to keep more of it. I suspect summarization (when
   we combine two transactions' predicate locks to save memory) would
   also cause problems.

None of these seem particularly insurmountable, but they suggest it
isn't a clear win to try to find an entire cycle.

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to