Hi,

On 2019-07-19 14:50:22 -0400, Robert Haas wrote:
> On Fri, Jul 19, 2019 at 2:04 PM Andres Freund <and...@anarazel.de> wrote:
> > It doesn't seem that hard - and kind of required for robustness
> > independent of the decision around "completeness" - to find a way to use
> > the locks already held by the prepared transaction.
>
> I'm not wild about finding more subtasks to put on the must-do list,
> but I agree it's doable.

Isn't that pretty inherently required? How are we otherwise ever going to be
able to roll back a transaction that holds an AEL on a relation it also
modifies? I might be standing on my own head here, though.

> > You could force new connections to complete the rollback processing of
> > the terminated connection, if there's too much pending UNDO. That'd be a
> > way of providing back-pressure against such crazy scenarios. Seems
> > again that it'd be good to have that pressure, independent of the
> > decision on completeness.
>
> That would definitely provide a whole lot of back-pressure, but it
> would also make the system unusable if the undo handler finds a way to
> FATAL, or just hangs for some stupid reason (stuck I/O?). It would be
> a shame if the administrative action needed to fix the problem were
> prevented by the back-pressure mechanism.

Well, then perhaps that admin ought not to constantly terminate
connections... I was thinking that new connections wouldn't be forced to
do that if there were still a lot of headroom regarding
#transactions-to-be-rolled-back. And if undo workers kept up, you'd also
not hit this.

> > Couldn't we record the outstanding transactions in the checkpoint, and
> > then recompute the changes to that record during WAL replay?
>
> Hmm, that's not a bad idea. So the transactions would have to "count"
> the moment they insert their first undo record, which is exactly the
> right thing anyway.
>
> Hmm, but what about transactions that are only touching unlogged tables?

Wouldn't we throw all that UNDO away in a crash restart? There's no
underlying table data anymore, after all. And for proper shutdown
checkpoints they could just be included.

Greetings,

Andres Freund