On Fri, Jul 8, 2022 at 10:44 PM Nathan Bossart <nathandboss...@gmail.com> wrote: > > > 0002 - I'm not quite happy with this patch, with the change, > > checkpoint errors out, if it can't remove just a file - the comments > > there says it all. Is there any strong reason for this change? > > Andres noted several concerns upthread. In short, ignoring unexpected > errors makes them harder to debug and likely masks bugs. > > FWIW I agree that it is unfortunate that a relatively non-critical error > here leads to checkpoint failures, which can cause much worse problems down > the road. I think this is one of the reasons for moving tasks like this > out of the checkpointer, as I'm trying to do with the proposed custodian > process [0]. > > [0] https://commitfest.postgresql.org/38/3448/
IMHO, we can keep it as-is and if required can be changed as part of the patch set [0], as this change without [0] can cause a checkpoint to fail. Furthermore, I would like it if we convert "could not parse filename" and "could not remove file" ERRORs of CheckPointLogicalRewriteHeap to LOGs until [0] gets in - others may have different opinions though. Just wondering - do we ever have a problem if we can't remove the snapshot or mapping file? May be unrelated, RemoveTempXlogFiles doesn't even bother to check if the temp wal file is removed. Regards, Bharath Rupireddy.