On Wed, Apr 29, 2020 at 9:57 PM Robert Haas <robertmh...@gmail.com> wrote: > > Hi, > > I think it's not good that do_pg_start_backup() takes a flag which > tells it to call back into basebackup.c's sendTablespace(). This means > that details which ought to be private to basebackup.c leak out and > become visible to other parts of the code. This seems to have > originated in commit 72d422a5227ef6f76f412486a395aba9f53bf3f0, and it > looks like there was some discussion of the issue at the time. I think > that patch was right to want only a single iteration over the > tablespace list; if not, the list of tablespaces returned by the > backup could be different from the list that is included in the > tablespace map, which does seem like a good thing to avoid. > > However, it doesn't follow that sendTablespace() needs to be called > from do_pg_start_backup(). It's not actually sending the tablespace at > that point, just calculating the size, because the sizeonly argument > is passed as true. And, there's no reason that I can see why that > needs to be done from within do_pg_start_backup(). It can equally well > be done after that function returns, as in the attached 0001. I > believe that this is functionally equivalent but more elegant, > although there is a notable behavior difference: today, > sendTablespaces() is called in sizeonly mode with "fullpath" as the > argument, which I think is pg_tblspc/$OID, and in non-sizeonly mode > with ti->path as an argument, which seems to be the path to which the > symlink points. With the patch, it would be called with the latter in > both cases. It looks to me like that should be OK, and it definitely > seems more consistent. >
If we want to move the calculation of size for tablespaces in the caller then I think we also need to do something about the progress reporting related to phase PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE. > While I was poking around in this area, I found some other code which > I thought could stand a bit of improvement also. The attached 0002 > slightly modifies some tablespace_map related code and comments in > perform_base_backup(), so that instead of having two very similar > calls to sendDir() right next to each other that differ only in the > value passed for the fifth argument, we have just one call with the > fifth argument being a variable. Although this is a minor change I > think it's a good cleanup that reduces the chances of future mistakes > in this area. > The 0002 patch looks good to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com