On Tue, Mar 27, 2018 at 10:55 AM, Michael Paquier <mich...@paquier.xyz> wrote: > On Tue, Mar 27, 2018 at 01:32:41AM +0900, Fujii Masao wrote: >> +1. It's better for us to focus on the code change of the fillter on >> pg_rewind >> rather than such "refactoring". > > (filter takes one 'l', not two) > > Okay. I had my mind mostly focused on how to shape the exclusion list > and get it shared between the base backup and pg_rewind, so let's move > on with the duplicated list for now. I did not put much efforts into > the pg_rewind portion to be honest. > >> As I told upthread, the previous patch has the >> problem where the files which should be skipped are not skipped. ISTM that, >> in pg_rewind, the filter should be triggered in recurse_dir() not >> process_source_file(). > > If you put that into recurse_dir you completely ignore the case where > changes are fetched by libpq. Doing the filtering when processing the > file map has the advantage to take care of both the local and remote > cases, which is why I am doing it there.
OK. >> BTW what should pg_rewind do when it finds the directory which should be >> skipped, in the source directory? In your patch, pg_rewind just tries to skip >> that directory at all. But isn't this right behavior? If that directory >> doesn't >> exist in the target directory (though I'm not sure if this situation is >> really >> possible), I'm thinking that pg_rewind should create that "empty" directory >> in the target. No? > > I am not exactly sure what you are coming up with here. The target > server should have the same basic directory mapping as the source as the > target has been initialized normally with initdb or a base backup from > another node, so checking for the *contents* of directories is enough > and keeps the code more simple, as the exclude filter entries are based > on elements inherent to PostgreSQL internals. Please note as well that > if a non-system directory is present on the source but not the target > then it would get created on the target. > > At the end I have finished with the attached. Thanks for the patch! + snprintf(localpath, sizeof(localpath), "%s/", + excludeDirContents[excludeIdx]); + if (strstr(path, localpath) != NULL) This code is almost ok in practice, but using the check of "strstr(path, localpath) == path" is more robust here? + for (excludeIdx = 0; excludeFiles[excludeIdx] != NULL; excludeIdx++) + { + if (strstr(path, excludeFiles[excludeIdx]) != NULL) Using the following code instead is more robust? This original code is almost ok in practice, though. filename = last_dir_separator(path); if (filename == NULL) filename = path; else filename++; if (strcmp(filename, excludeFiles[excludeIdx]) == 0) + (everything except the relation files). Similarly to base backups, + the contents of the directories <filename>pg_dynshmem/</filename>, + <filename>pg_notify/</filename>, <filename>pg_replslot/</filename>, + <filename>pg_serial/</filename>, <filename>pg_snapshots/</filename>, + <filename>pg_stat_tmp/</filename>, and + <filename>pg_subtrans/</filename> are omitted from the data copied + from the source cluster. Any file or directory beginning with + <filename>pgsql_tmp</filename> is omitted, as well as are + <filename>backup_label</filename>, + <filename>tablespace_map</filename>, + <filename>pg_internal.init</filename>, + <filename>postmaster.opts</filename> and + <filename>postmaster.pid</filename>. I don't think this description is necessary. The doc for pg_basebackup also doesn't seem to have this kind of description. So attached is the patch that I updated based on your patch and am thinking to apply. Regards, -- Fujii Masao
add_exclude_list_similar_to_base_backup_v2_fujii.patch
Description: Binary data