On Wed, Feb 16, 2022 at 10:59:38PM -0800, Andres Freund wrote:
> On 2022-02-16 20:14:04 -0800, Nathan Bossart wrote:
>> >> -	while ((spc_de = ReadDirExtended(spc_dir, "pg_tblspc", LOG)) != NULL)
>> >> +	while (!ShutdownRequestPending &&
>> >> +		   (spc_de = ReadDirExtended(spc_dir, "pg_tblspc", LOG)) != NULL)
>> >
>> > Uh, huh? It strikes me as a supremely bad idea to have functions *silently*
>> > not do their jobs when ShutdownRequestPending is set, particularly without a
>> > huge fat comment.
>>
>> The idea was to avoid delaying shutdown because we're waiting for the
>> custodian to finish relatively nonessential tasks. Another option might be
>> to just exit immediately when the custodian receives a shutdown request.
>
> I think we should just not do either of these and let the functions
> finish. For the cases where shutdown really needs to be immediate
> there's, uhm, immediate mode shutdowns.

Alright.

>> > Why does this not open us up to new xid wraparound issues? Before there was a
>> > hard bound on how long these files could linger around. Now there's not
>> > anymore.
>>
>> Sorry, I'm probably missing something obvious, but I'm not sure how this
>> adds transaction ID wraparound risk. These files are tied to LSNs, and
>> AFAIK they won't impact slots' xmins.
>
> They're accessed by xid. The LSN is just for cleanup. Accessing files
> left over from a previous transaction with the same xid wouldn't be
> good - we'd read wrong catalog state for decoding...

Okay, that part makes sense to me. However, I'm still confused about how
this is handled today and why moving cleanup to a separate auxiliary
process makes matters worse. I've done quite a bit of reading, and I
haven't found anything that seems intended to prevent this problem. Do you
have any pointers?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com