On Sat, Jul 20, 2019 at 11:28 AM Peter Geoghegan <p...@bowt.ie> wrote:
> On Fri, Jul 19, 2019 at 4:14 PM Robert Haas <robertmh...@gmail.com> wrote:
> > I don't think this matters here at all. As long as there's only DML
> > involved, there won't be any lock conflicts anyway - everybody's
> > taking RowExclusiveLock or less, and it's all fine. If you update a
> > row in zheap, abort, and then try to update again before the rollback
> > happens, we'll do a page-at-a-time rollback in the foreground, and
> > proceed with the update; when we get around to applying the undo,
> > we'll notice that page has already been handled and skip the undo
> > records that pertain to it. To get the kinds of problems I'm on about
> > here, somebody's got to be taking some more serious locks.
>
> If I'm not mistaken, you're tacitly assuming that you'll always be
> using zheap, or something sufficiently similar to zheap. It'll
> probably never be possible to UNDO changes to something like a GIN
> index on a zheap table, because you can never do that with sensible
> concurrency/deadlock behavior.
>
> I don't necessarily have a problem with that. I don't pretend to
> understand how much of a problem it is. Obviously it partially depends
> on what your ambitions are for this infrastructure. Still, assuming
> that I have it right, ISTM that UNDO/zheap/whatever should explicitly
> own this restriction.

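To make the scenario in the quoted text concrete (a page-at-a-time
rollback in the foreground, followed by a later undo pass that skips
pages already handled), here is a minimal toy model in Python. It is a
sketch only and is not taken from zheap: the names UndoRecord, Page,
rollback_page and apply_undo, and the per-page undone_xids set used to
remember which transactions have been rolled back, are all assumptions
made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UndoRecord:
    page_no: int      # page the change was made on
    slot: int         # row slot within that page
    old_value: str    # value to restore on rollback


@dataclass
class Page:
    rows: dict                                        # slot -> current value
    undone_xids: set = field(default_factory=set)     # hypothetical bookkeeping


def rollback_page(pages, undo_log, xid, page_no):
    """Undo everything xid did to one page; return the number of records applied.

    Calling it again for the same (xid, page) is a no-op, which is what lets
    a foreground writer and a later undo pass both call it safely.
    """
    page = pages[page_no]
    if xid in page.undone_xids:
        return 0  # this page was already handled: skip its undo records
    applied = 0
    # Apply newest-first so the oldest pre-image ends up in place.
    for rec in reversed(undo_log[xid]):
        if rec.page_no == page_no:
            page.rows[rec.slot] = rec.old_value
            applied += 1
    page.undone_xids.add(xid)
    return applied


def apply_undo(pages, undo_log, xid):
    """Later undo pass: visit every page xid touched, skipping handled ones."""
    applied = 0
    for page_no in sorted({rec.page_no for rec in undo_log[xid]}):
        applied += rollback_page(pages, undo_log, xid, page_no)
    return applied
```

In this model a writer that finds a row modified by an aborted
transaction would call rollback_page() on just that page and then
proceed with its update; when apply_undo() runs afterwards it leaves
that page alone and only touches the pages nobody got to first.
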
I had a similar thought: you might regret that choice if you were
wanting to implement an AM with lock table-based concurrency control
(meaning that there are lock ordering concerns for row and page locks,
for DML statements, not just DDL). That seemed a bit too far-fetched
to mention before, but are you saying the same sort of concerns might
come up with indexes that support true undo (as opposed to indexes
that still need VACUUM)?

For comparison, ARIES[1] has no-deadlock rollbacks as a basic property
and reacquires locks during restart before new transactions are
allowed to execute. In its model, the locks in question can be on
things like rows and pages. We don't even use our lock table for
those (except for non-blocking SIREAD locks, irrelevant here). After
crash recovery, if zheap encounters a row with pending rollback from
an aborted transaction, as usual it either needs to read an older
version from an undo log (for reads) or help execute the rollback
before updating (for writes). That only requires page-at-a-time
LWLocks ("latching"), so it's deadlock-free. The only deadlock risk
comes from the need to acquire heavyweight locks on relations, which
typically only conflict when you run DDL, so yeah, it's tempting to
worry a lot less about those than the fine-grained lock traffic from
DML statements that DB2 and others have to deal with.

So, to spell out the two options again:

A. Rollback can't deadlock. You have to make sure you reliably hold
locks until rollback is completed (including some tricky new lock
transfer magic), and then reacquire them after recovery before new
transactions are allowed. You could trivially achieve the restart
part by simply waiting until all rollback is executed before you allow
new transactions, but other systems including DB2 first acquire all
the locks in an earlier scan through the log, then allow new
connections, and then execute the rollback. Acquiring them before new
transactions are allowed means that they must fit in the lock table
and there must be no conflicts among them if they were all granted as
at the moment you crashed or shut down.

B. Rollback can deadlock or exhaust the lock table because we release
the locks and reacquire them some arbitrary time later. There is no
choice but to keep retrying if anything goes wrong, and rollback is
theoretically not guaranteed to complete: you can contrive a workload
that will never make progress. This amounts to betting that these
problems will be rare enough that it doesn't matter and that rollback
will eventually make progress, and it should be fairly clear what's
happening and why.

I might as well put the quote marks on now: "Perhaps we could
implement A later."

[1] https://cs.stanford.edu/people/chrismre/cs345/rl/aries.pdf

-- 
Thomas Munro
https://enterprisedb.com