On Sat, Oct 15, 2022 at 12:03 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 02:15:19PM +0530, Bharath Rupireddy wrote:
> > Given that temp file name includes WAL file name, epoch to
> > milliseconds scale and MyProcPid, can there be name collisions after a
> > server crash or even when multiple servers with different pids are
> > archiving/copying the same WAL file to the same directory?
>
> While unlikely, I think it's theoretically possible.

Can you please help me understand how name collisions can happen with
temp file names including WAL file name, timestamp to millisecond
scale, and PID? Having the timestamp is enough to provide a unique
temp file name when PID wraparound occurs, right? Am I missing
something here?

> > What happens to the left-over temp files after a server crash? Will
> > they be lying around in the archive directory? I understand that we
> > can't remove such files because we can't distinguish left-over files
> > from a crash and the temp files that another server is in the process
> > of copying.
>
> The temporary files are not automatically removed after a crash. The
> documentation for basic archive has a note about this [0].

Hm, we cannot remove the temp file for all sorts of crashes, but
having on_shmem_exit() or before_shmem_exit() or atexit() or any such
callback removing it would help us cover some crash scenarios (those
that exit with proc_exit() or exit()) at least. I think the
basic_archive module currently leaves temp files around even when the
server is restarted legitimately while copying to or renaming the temp
file, no?

I can quickly find these exit callbacks deleting the files:

atexit(cleanup_directories_atexit);
atexit(remove_temp);
before_shmem_exit(ReplicationSlotShmemExit, 0);
before_shmem_exit(logicalrep_worker_onexit, (Datum) 0);
before_shmem_exit(BeforeShmemExit_Files, 0);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
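
To make the exit-callback idea above concrete, here is a minimal standalone sketch in plain POSIX C. It is not basic_archive code. The helper names (pending_temp, make_temp_name, remove_pending_temp, archive_copy) and the "archtemp.<wal>.<pid>.<epoch_ms>" layout are assumptions for illustration only. It builds a temp name from the WAL file name, PID, and millisecond timestamp, and registers an atexit() callback that unlinks a temp file still pending at exit.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/* Temp file currently being written; empty string when nothing is pending. */
static char pending_temp[1024] = "";

/* Exit callback: best-effort removal of a temp file left behind. */
static void
remove_pending_temp(void)
{
    if (pending_temp[0] != '\0')
    {
        (void) unlink(pending_temp);
        pending_temp[0] = '\0';
    }
}

/* Fill pending_temp with "<dir>/archtemp.<wal>.<pid>.<epoch_ms>" (assumed layout). */
static int
make_temp_name(const char *dir, const char *wal)
{
    struct timeval tv;
    unsigned long long epoch_ms;
    int         n;

    assert(dir != NULL && wal != NULL);
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    epoch_ms = (unsigned long long) tv.tv_sec * 1000ULL +
        (unsigned long long) tv.tv_usec / 1000ULL;
    n = snprintf(pending_temp, sizeof(pending_temp), "%s/archtemp.%s.%d.%llu",
                 dir, wal, (int) getpid(), epoch_ms);
    if (n < 0 || (size_t) n >= sizeof(pending_temp))
    {
        pending_temp[0] = '\0';
        return -1;
    }
    return 0;
}

/* Write data to a temp file, then rename it into place as <dir>/<wal>. */
static int
archive_copy(const char *dir, const char *wal, const char *data, size_t len)
{
    static int  registered = 0;
    char        dest[1024];
    int         fd;
    int         n;

    if (!registered)
    {
        /* Register once; runs on exit() or return from main() only. */
        if (atexit(remove_pending_temp) != 0)
            return -1;
        registered = 1;
    }
    n = snprintf(dest, sizeof(dest), "%s/%s", dir, wal);
    if (n < 0 || (size_t) n >= sizeof(dest))
        return -1;
    if (make_temp_name(dir, wal) != 0)
        return -1;

    /* O_EXCL turns a name collision into an error instead of an overwrite. */
    fd = open(pending_temp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
    {
        pending_temp[0] = '\0';     /* not our file, so never unlink it */
        return -1;
    }
    if (write(fd, data, len) != (ssize_t) len)
    {
        close(fd);
        remove_pending_temp();
        return -1;
    }
    if (close(fd) != 0 || rename(pending_temp, dest) != 0)
    {
        remove_pending_temp();
        return -1;
    }
    pending_temp[0] = '\0';         /* renamed into place; nothing to clean up */
    return 0;
}
```

A caller would invoke something like archive_copy(dir, wal_name, buf, len) once per WAL file. As noted above, a callback of this kind only covers exits that go through exit(). A SIGKILL, _exit(), or power loss still leaves the temp file behind.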