Hi,

On 2022-07-03 10:07:54 -0700, Nathan Bossart wrote:
> Thanks for the prompt review.
>
> On Sat, Jul 02, 2022 at 03:54:56PM -0700, Andres Freund wrote:
> > On 2022-07-02 15:05:54 -0700, Nathan Bossart wrote:
> >> +	/* Obtain requested tasks */
> >> +	SpinLockAcquire(&CustodianShmem->cust_lck);
> >> +	flags = CustodianShmem->cust_flags;
> >> +	CustodianShmem->cust_flags = 0;
> >> +	SpinLockRelease(&CustodianShmem->cust_lck);
> >
> > Just resetting the flags to 0 is problematic. Consider what happens if
> > there's two tasks and the one processed first errors out. You'll lose
> > information about needing to run the second task.
>
> I think we also want to retry any failed tasks.
I don't think so, at least not if it's just going to retry that task
straight away - then we'll get stuck on that one task forever. If we had
the ability to "queue" it at the end, to be processed after other already
dequeued tasks, it'd be a different story.

> The way v6 handles this is by requesting all tasks after an exception.

Ick. That strikes me as a bad idea.

> >> +/*
> >> + * RequestCustodian
> >> + *		Called to request a custodian task.
> >> + */
> >> +void
> >> +RequestCustodian(int flags)
> >> +{
> >> +	SpinLockAcquire(&CustodianShmem->cust_lck);
> >> +	CustodianShmem->cust_flags |= flags;
> >> +	SpinLockRelease(&CustodianShmem->cust_lck);
> >> +
> >> +	if (ProcGlobal->custodianLatch)
> >> +		SetLatch(ProcGlobal->custodianLatch);
> >> +}
> >
> > With this representation we can't really implement waiting for a task or
> > such. And it doesn't seem like a great API for the caller to just
> > specify a mix of flags.
>
> At the moment, the idea is that nothing should need to wait for a task
> because the custodian only handles things that are relatively
> non-critical.

Which is just plainly not true as the patchset stands... I think we're
going to have to block if some cleanup as part of a checkpoint hasn't been
completed by the next checkpoint - otherwise it'll just end up being way
too confusing and there's absolutely no backpressure anymore.

> If that changes, this could probably be expanded to look more like
> RequestCheckpoint().
>
> What would you suggest using instead of a mix of flags?

I suspect an array of tasks with requested and completed counters or such?
With a condition variable to wait on?

> >> +	/* let the custodian know what it can remove */
> >> +	CustodianSetLogicalRewriteCutoff(cutoff);
> >
> > Setting this variable in a custodian datastructure and then fetching it
> > from there seems architecturally wrong to me.
>
> Where do you think it should go?
> I previously had it in the checkpointer's shared memory, but you didn't
> like that the functions were declared in bgwriter.h (along with the other
> checkpoint stuff). If the checkpointer shared memory is the right place,
> should we create checkpointer.h and use that instead?

Well, so far I have not understood what the whole point of the shared
state is, so I have a bit of a hard time answering this ;)

> >> +/*
> >> + * Remove all mappings not needed anymore based on the logical restart
> >> + * LSN saved by the checkpointer.  We use this saved value instead of
> >> + * calling ReplicationSlotsComputeLogicalRestartLSN() so that we don't
> >> + * interfere with an ongoing call to CheckPointLogicalRewriteHeap()
> >> + * that is flushing mappings to disk.
> >> + */
> >
> > What interference could there be?
>
> My concern is that the custodian could obtain a later cutoff than what
> the checkpointer does, which might cause files to be concurrently
> unlinked and fsync'd. If we always use the checkpointer's cutoff, that
> shouldn't be a problem. This could probably be better explained in this
> comment.

How about having a Datum argument to RequestCustodian() that is forwarded
to the task?

> >> +void
> >> +RemoveOldLogicalRewriteMappings(void)
> >> +{
> >> +	XLogRecPtr	cutoff;
> >> +	DIR		   *mappings_dir;
> >> +	struct dirent *mapping_de;
> >> +	char		path[MAXPGPATH + 20];
> >> +	bool		value_set = false;
> >> +
> >> +	cutoff = CustodianGetLogicalRewriteCutoff(&value_set);
> >> +	if (!value_set)
> >> +		return;
> >
> > Afaics nothing clears value_set - is that a good idea?
>
> I'm using value_set to differentiate the case where InvalidXLogRecPtr
> means the checkpointer hasn't determined a value yet versus the case
> where it has. In the former, we don't want to take any action. In the
> latter, we want to unlink all the files. Since we're moving to a request
> model for the custodian, I might be able to remove this value_set stuff
> completely.
> If that's not possible, it probably deserves a better comment.

It would.

Greetings,

Andres Freund