Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> We currently can check for missing heap/index files by comparing
> pg_class with the database directory files.  However, I am not clear if
> this is safe during concurrent DDL.  I assume we create the file before
> the update to pg_class is visible, but do we always delete the file
> after the update to pg_class is visible?  I assume any external checking
> tool would need to lock the relation to prevent concurrent DDL.
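
To make the check being discussed concrete, here is a rough sketch in Python of comparing a list of expected relfilenodes against one database directory. Everything in it is an assumption for illustration: the function name missing_relfiles is made up, the expected list is assumed to have been collected from pg_class beforehand (for example via pg_relation_filepath()), and only main-fork files are considered. It does nothing about the concurrent DDL question, since the catalog read and the directory read happen at different times.

```python
import os
import re

# Main-fork segment names look like "16384" or "16384.1".
# Fork files such as "16384_fsm" or "16384_vm" are deliberately not matched.
_SEGMENT_RE = re.compile(r"^(\d+)(?:\.(\d+))?$")


def missing_relfiles(expected_filenodes, db_dir):
    """Return expected relfilenodes that have no main-fork file in db_dir.

    expected_filenodes: iterable of relfilenode numbers as strings
    (hypothetical input, gathered from pg_class ahead of time).
    """
    present = set()
    for name in os.listdir(db_dir):
        match = _SEGMENT_RE.match(name)
        if match:
            # Any segment counts as evidence that the relation's file exists.
            present.add(match.group(1))
    return sorted(set(expected_filenodes) - present, key=int)
```

A relation created or dropped between the two reads would show up as a false positive or be missed entirely, which is exactly why the locking question matters.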

It'd sure be nice if an external tool (such as one trying to back up the
database..) could get this full list *without* having to run around and
lock everything.  This is because of some fun discoveries that have been
made around readdir() not always being entirely honest.  Take a look at:

https://github.com/pgbackrest/pgbackrest/issues/1754

and

https://gitlab.alpinelinux.org/alpine/aports/-/issues/10960

TL;DR: if you're removing files from a directory that you've got an
active readdir() running through, you might not actually get all of the
*existing* files.  Given that PG is happy to remove files from PGDATA
while a backup is running, in theory this could lead to a backup utility
like pgbackrest or pg_basebackup not actually backing up all the files.

Now, pgbackrest runs the readdir() very quickly to build a manifest of
all of the files to backup, minimizing the window for this to possibly
happen, but pg_basebackup keeps a readdir() open during the entire
backup, making this more possible.

> Also, how would it check if the number of extents is correct?  Seems we
> would need this value to be in pg_class, and have the same update
> protections outlined above.  Seems that would require heavier locking.

Would be nice to have but also would be expensive to maintain..

Thanks,

Stephen
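
P.S. For anyone wondering what "run the readdir() quickly to build a manifest" looks like in practice, here is a minimal sketch in Python. It is not pgbackrest's or pg_basebackup's actual code; build_manifest is a hypothetical name, and the idea is only to drain the directory iterator in one go and do the slower per-file work afterwards. That narrows the window in which a concurrent unlink can confuse the listing, it does not close it.

```python
import os
import stat


def build_manifest(directory):
    """Return {file name: size in bytes} for regular files in directory."""
    # Drain the directory iterator immediately, so the listing is held
    # open only for as long as it takes to collect the names.
    with os.scandir(directory) as entries:
        names = [entry.name for entry in entries]

    manifest = {}
    for name in names:
        try:
            info = os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # The file was removed after we listed it; skip it.
            continue
        if stat.S_ISREG(info.st_mode):
            manifest[name] = info.st_size
    return manifest
```

The copy step would then work from the returned manifest rather than from a directory handle that stays open for the whole backup.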