On Thu, Dec 26, 2013 at 5:58 PM, Robert Haas <robertmh...@gmail.com> wrote:
> While mulling this over further, I had an idea about this: suppose we
> marked the tuple in some fashion that indicates that it's a promise
> tuple. I imagine an infomask bit, although the concept makes me wince
> a bit since we don't exactly have bit space coming out of our ears
> there. Leaving that aside for the moment, whenever somebody looks at
> the tuple with a mind to calling XactLockTableWait(), they can see
> that it's a promise tuple and decide to wait on some other heavyweight
> lock instead. The simplest thing might be for us to acquire a
> heavyweight lock on the promise tuple before making index entries for
> it, and then have callers wait on that instead always instead of
> transitioning from the tuple lock to the xact lock.

I think the interlocking with buffer locks and heavyweight locks needed
to make that work could be complex. I'm working on a scheme where we
always acquire a page heavyweight lock ahead of acquiring an equivalent
buffer lock, and without any other buffer locks held (for the critical
choke point buffer, to implement value locking). With my scheme, you may
have to retry, but only in the event of page splits and only at the
choke point.

In any case, what you describe here strikes me as an expansion of the
already less-than-ideal modularity violation within the btree AM (i.e.
the way it buffer locks the heap concurrently with its own index buffers
for uniqueness checking). It might be that the best argument for
explicit value locks (implemented as page heavyweight locks or whatever)
is that they are completely distinct from row locks, and are an
abstraction managed entirely by the AM itself, quite similar to the
historic, limited value locking that unique index enforcement has always
used.

If we take Heikki's POC patch as representative of promise tuple schemes
in general, this scheme might not be good enough. Index tuple insertions
don't wait on each other there, and immediately report a conflict. We
need pre-checking to get an actual conflict TID in that patch, with no
help from btree available.

I'm generally opposed to having value locks of any stripe held for more
than an instant (so we should not hold them indefinitely pending another
conflicting xact finishing). It's not just that it's convenient to my
implementation; I also happen to think that it makes no sense. Should
you really lock a value in an earlier unique index for hours, pending
the conflicting xact finishing, because you just might happen to want to
insert said value, but probably not?

--
Peter Geoghegan
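
For illustration only, here is a toy model of the lock ordering described
above: take the page heavyweight lock first, with no buffer locks held,
then the buffer lock, and retry if a page split has moved the value's key
range to the right sibling in the meantime. This is a rough sketch under
loud assumptions, not code from any patch. The names Page, split,
find_leaf, lock_value and the before_lock hook are all hypothetical, and
both lock types are stood in for by plain Python mutexes.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy leaf page (hypothetical): covers keys below high_key."""
    high_key: float
    right: "Page | None" = None
    keys: list = field(default_factory=list)
    # Stand-ins for the two lock types; both are ordinary mutexes here.
    heavyweight_lock: threading.Lock = field(default_factory=threading.Lock)
    buffer_lock: threading.Lock = field(default_factory=threading.Lock)


def split(page):
    """Move the upper half of page's keys to a new right sibling."""
    mid = len(page.keys) // 2
    new_right = Page(high_key=page.high_key, right=page.right,
                     keys=page.keys[mid:])
    page.high_key = new_right.keys[0]
    page.keys = page.keys[:mid]
    page.right = new_right
    return new_right


def find_leaf(leftmost, key):
    """Walk right-links (unlocked, toy only) to the page covering key."""
    page = leftmost
    while key >= page.high_key and page.right is not None:
        page = page.right
    return page


def lock_value(leftmost, key, before_lock=None):
    """Return (page, retries) with both locks held on the covering page.

    before_lock is a hypothetical test hook used to simulate a concurrent
    page split between locating the page and locking it.
    """
    retries = 0
    while True:
        page = find_leaf(leftmost, key)
        if before_lock is not None:
            before_lock(page)
        # Heavyweight lock first; no buffer locks are held at this point.
        page.heavyweight_lock.acquire()
        page.buffer_lock.acquire()
        if key < page.high_key or page.right is None:
            return page, retries
        # A split moved our key range right: drop both locks and retry.
        page.buffer_lock.release()
        page.heavyweight_lock.release()
        retries += 1


# Demo: a split sneaks in once, just before the first lock attempt.
leaf = Page(high_key=float("inf"), keys=[10, 20, 30, 40])
already_split = []


def split_once(page):
    if not already_split:
        split(page)
        already_split.append(True)


locked_page, retries = lock_value(leaf, 35, before_lock=split_once)
```

In this model the only retry is the one caused by the split, and the
original page ends up with neither lock held; nothing here attempts to
model waiting, deadlock detection, or the real lock manager.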