Dear Euler, Thanks for updating the patch! Before reading yours, I wanted to reply some of comments.
> I'm still thinking about replacing --subscriber-conninfo with separate items (username, port, password?, host = socket dir). Maybe it is an overengineering. The user can always prepare the environment to avoid unwanted and/or external connections. > For me, required amount of fixes are not so different from current one. How about others? > It is extremely useful because (a) you have a physical replication setup and don't know if it is prepared for logical replication, (b) check GUCs (is max_wal_senders sufficient for this pg_subscriber command? Or is max_replication_slots sufficient to setup the logical replication even though I already have some used replication slots?), (c) connectivity and (d) credentials. > Yeah, it is useful for verification purpose, so let's keep this option. But I still think the naming should be "--check". Also, there are many `if (!dry_run)` but most of them can be removed if the process exits earlier. Thought? > > 05. > I felt we should accept some settings from enviroment variables, like > pg_upgrade. > Currently, below items should be acceted. > > - data directory > - username > - port > - timeout Maybe PGDATA. > Sorry, I cannot follow this. Did you mean that the target data directory should be able to be specified by PGDATA? OF so, +1. > > 06. > ``` > pg_logging_set_level(PG_LOG_WARNING); > ``` > > If the default log level is warning, there are no ways to output debug logs. > (-v option only raises one, so INFO would be output) > I think it should be PG_LOG_INFO. You need to specify multiple -v options. > Hmm. I felt the specification was bit strange...but at least it must be described on the documentation. pg_dump.sgml has similar lines. > > 08. > Not sure, but if we want to record outputs by pg_subscriber, the sub-directory > should be created. The name should contain the timestamp. The log file already contains the timestamp. Why? > This comment assumed outputs by pg_subscriber were also recorded to a file. In this case and if the file also has the same timestamp, I think they can be gathered in the same place. No need if outputs are not recorded. > > 09. > Not sure, but should we check max_slot_wal_keep_size of primary server? It can > avoid to fail starting of logical replicaiton. A broken physical replication *before* running this tool is its responsibility? Hmm. We might add another check that can be noticed during dry run mode. > I thought that we should not generate any broken objects, but indeed, not sure it is our scope. How do other think? > > 11. > ``` > /* > * Stop the subscriber if it is a standby server. Before executing the > * transformation steps, make sure the subscriber is not running because > * one of the steps is to modify some recovery parameters that require a > * restart. > */ > if (stat(pidfile, &statbuf) == 0) > ``` > > I kept just in case, but I'm not sure it is still needed. How do you think? > Removing it can reduce an inclusion of pidfile.h. Are you suggesting another way to check if the standby is up and running? > Running `pg_ctl stop` itself can detect whether the process has been still alive. It would exit with 1 when the process is not there. > > I didn't notice the comment, but still not sure the reason. Why we must > reserve > the slot until pg_subscriber finishes? IIUC, the slot would be never used, it > is created only for getting a consistent_lsn. So we do not have to keep. > Also, just before, logical replication slots for each databases are created, > so > WAL records are surely reserved. > I want to confirm the conclusion - will you remove the creation of a transient slot? Also, not tested, I'm now considering that we can reuse the primary_conninfo value. We are assuming that the target server is standby and the current upstream one will convert to publisher. In this case, the connection string is already specified as primary_conninfo so --publisher-conninfo may not be needed. The parameter does not contain database name, so --databases is still needed. I imagine like: 1. Parse options 2. Turn on standby 3. Verify the standby 4. Turn off standby 5. Get primary_conninfo from standby 6. Connect to primary (concat of primary_conninfo and an option is used) 7. Verify the primary ... How do you think? Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/global/