On Thu, Aug 8, 2019 at 9:31 AM Andres Freund <and...@anarazel.de> wrote:
> I know that Robert is working on a patch that revises the undo request
> layer somewhat, it's possible that this is best discussed afterwards.

Here's what I have at the moment. This is not by any means a complete replacement for Amit's undo worker machinery, but it is a significant redesign of (and, I believe, a significant improvement to) the queue management stuff from Amit's patch. I wrote this pretty quickly, so while it passes simple testing, it probably has a number of bugs, and to actually use it, it would need to be integrated with xact.c; right now it's just a standalone module that doesn't do anything except let itself be tested.

Some of the ways it is different from Amit's patches include:

* It uses RBTree rather than binaryheap, so when we look ahead, we look ahead in the right order.

* There's no limit to the lookahead distance; when looking ahead, it will search the entirety of all 3 RBTrees for an entry from the right database.

* It doesn't have a separate hash table keyed by XID. I didn't find that necessary.

* It's better-isolated, as you can see from the fact that I've included a test module that tests this code without actually ever putting an UndoRequestManager in shared memory. I would've liked to expand this test module, but I don't have time to do that today and felt it better to get this much sent out.

* It has a lot of comments explaining the design and how it's intended to integrate with the rest of the system.

Broadly, my vision for how this would get used is:

- Create an UndoRequestManager in shared memory.

- Before a transaction first attaches to a permanent or unlogged undo log, xact.c would call RegisterUndoRequest(); thereafter, xact.c would store a pointer to the UndoRequest for the lifetime of the toplevel transaction.

- Immediately after attaching to a permanent or unlogged undo log, xact.c would call UndoRequestSetLocation().

- xact.c would track the number of bytes of permanent and unlogged undo records the transaction generates. If the transaction goes on to abort, it reports these by calling FinalizeUndoRequest().

- If the transaction commits, it doesn't need that information, but does need to call UnregisterUndoRequest() as a post-commit step in CommitTransaction().

- In the case of an abort, after calling FinalizeUndoRequest(), xact.c would call PerformUndoInBackground() to find out whether to do undo in the background or the foreground. If undo is to be done in the foreground, the backend must go on to call UnregisterUndoRequest() if undo succeeds, and RescheduleUndoRequest() if it fails.

- In the case of a prepared transaction, a pointer to the UndoRequest would get stored in the GlobalTransaction (but nothing extra would get stored in the twophase state file).

- COMMIT PREPARED calls UnregisterUndoRequest().

- ROLLBACK PREPARED calls PerformUndoInBackground(); if told to do undo in the foreground, it must go on to call either UnregisterUndoRequest() on success or RescheduleUndoRequest() on failure, just like in the regular abort case.

- After a crash, once recovery is complete but before we open for connections, or at least before we allow any new undo activity, the discard worker scans all the logs and makes a bunch of calls to RecreateUndoRequest(). Then, for each prepared transaction that still exists, it calls SuspendPreparedUndoRequest() and uses the return value to reset the UndoRequest pointer in the GlobalTransaction. Only once both of those steps are completed can undo workers be safely started.

- Undo workers call GetNextUndoRequest() to get the next task that they should perform, and once they do, they "own" the undo request. When undo succeeds or fails, they must call either UnregisterUndoRequest() or RescheduleUndoRequest(), as appropriate, just like for foreground undo. Making sure this is water-tight will probably require some well-done integration with xact.c, so that an undo request that we "own" because we got it in a background undo apply process looks exactly the same as one we "own" because it's our transaction originally.

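To make the commit and abort paths in the list above easier to follow, here is a minimal, self-contained sketch in C. It is not code from the patch. The function names are the ones mentioned above, but every signature, the ToyUndoRequest struct, the ToyState values, the 1024-byte background cutoff, and the toy_end_transaction() helper are assumptions invented for illustration; the real module may decide between foreground and background undo on entirely different grounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states; the patch may track this differently. */
typedef enum { UR_FREE, UR_REGISTERED, UR_FINALIZED, UR_QUEUED } ToyState;

typedef struct
{
    ToyState      state;
    unsigned long start_location;   /* stand-in for an undo log location */
    size_t        logged_bytes;
    size_t        unlogged_bytes;
} ToyUndoRequest;

/* Invented cutoff: larger aborts go to the background in this toy. */
#define TOY_BACKGROUND_THRESHOLD 1024

/* All signatures below are assumptions, not the patch's real API. */
static void
RegisterUndoRequest(ToyUndoRequest *req)
{
    assert(req->state == UR_FREE);
    req->state = UR_REGISTERED;
    req->start_location = 0;
    req->logged_bytes = req->unlogged_bytes = 0;
}

static void
UndoRequestSetLocation(ToyUndoRequest *req, unsigned long location)
{
    assert(req->state == UR_REGISTERED);
    req->start_location = location;
}

static void
FinalizeUndoRequest(ToyUndoRequest *req, size_t logged, size_t unlogged)
{
    assert(req->state == UR_REGISTERED);
    req->logged_bytes = logged;
    req->unlogged_bytes = unlogged;
    req->state = UR_FINALIZED;
}

/* Returns true if the request was queued for a background worker. */
static bool
PerformUndoInBackground(ToyUndoRequest *req)
{
    assert(req->state == UR_FINALIZED);
    if (req->logged_bytes + req->unlogged_bytes > TOY_BACKGROUND_THRESHOLD)
    {
        req->state = UR_QUEUED;
        return true;
    }
    return false;
}

static void
UnregisterUndoRequest(ToyUndoRequest *req)
{
    assert(req->state != UR_FREE);
    req->state = UR_FREE;
}

static void
RescheduleUndoRequest(ToyUndoRequest *req)
{
    assert(req->state == UR_FINALIZED);
    req->state = UR_QUEUED;
}

typedef enum
{
    TOY_COMMITTED, TOY_UNDONE_IN_FOREGROUND,
    TOY_SENT_TO_BACKGROUND, TOY_RESCHEDULED
} ToyOutcome;

/* Hypothetical driver: the call order xact.c would follow at end of xact. */
static ToyOutcome
toy_end_transaction(ToyUndoRequest *req, bool commit,
                    size_t logged, size_t unlogged, bool undo_succeeds)
{
    if (commit)
    {
        UnregisterUndoRequest(req);     /* post-commit step */
        return TOY_COMMITTED;
    }
    FinalizeUndoRequest(req, logged, unlogged);
    if (PerformUndoInBackground(req))
        return TOY_SENT_TO_BACKGROUND;  /* a worker will own it later */
    if (undo_succeeds)
    {
        UnregisterUndoRequest(req);
        return TOY_UNDONE_IN_FOREGROUND;
    }
    RescheduleUndoRequest(req);         /* foreground undo failed */
    return TOY_RESCHEDULED;
}
```

The point of the sketch is only the ordering: a request is always registered before any undo is written, and every path out of the transaction ends in exactly one of UnregisterUndoRequest(), RescheduleUndoRequest(), or a hand-off to the background.
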
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0001-Draft-of-new-undo-request-manager.patch