Hi

I have set up a master/standby pair on PostgreSQL 9.5 on two test servers and am
trialing repmgr (https://github.com/2ndQuadrant/repmgr/).

I am testing a switchover using the following:

-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf -C /etc/repmgr/9.5/repmgr.conf standby switchover -L DEBUG -v

The switchover appears to hang at the final step of the process:

NOTICE: restarting server using '/usr/pgsql-9.5/bin/pg_ctl  -w -D /var/lib/pgsql/9.5/data -m fast restart'
pg_ctl: PID file "/var/lib/pgsql/9.5/data/postmaster.pid" does not exist
Is server running?
starting server anyway

It appears to have worked, though: when I run the cluster show command on both
servers, it shows the switchover.
-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf cluster show
Role      | Name           | Upstream       | Connection String
----------+----------------|----------------|-------------------------------------------
* master  | itupl-postgen2 |                | host=10.70.3.252 dbname=repmgr user=repmgr
  standby | itupl-postgen1 | itupl-postgen2 | host=10.70.3.251 dbname=repmgr user=repmgr

It is also shown correctly in the repl_nodes table of both databases.
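
For anyone wanting to repeat that check, a query along these lines can be run against each node. This is only a sketch: the schema and column names are taken from the get_node_record() DEBUG line in the complete output, the two function names are made up for illustration, and it assumes psql can connect as the repmgr user without a password prompt.

```shell
# SQL built from the column names that appear in the get_node_record() DEBUG line.
# (repl_nodes_sql and show_repl_nodes are hypothetical helper names.)
repl_nodes_sql() {
  echo 'SELECT id, type, upstream_node_id, name, active FROM "repmgr_repmgr_cluster".repl_nodes ORDER BY id;'
}

# Run the query on one node; the host is passed as the first argument.
show_repl_nodes() {
  psql -h "$1" -U repmgr -d repmgr -c "$(repl_nodes_sql)"
}

# Usage (run by hand, once per node):
#   show_repl_nodes 10.70.3.251
#   show_repl_nodes 10.70.3.252
```

After a clean switchover, both nodes should report node 2 as type 'master' and node 1 as type 'standby' with upstream_node_id 2.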

Why is it hanging? Thank you for your help.

Here is the complete output:
-----------------------------------------------
-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf -C /etc/repmgr/9.5/repmgr.conf standby switchover -L DEBUG -v
NOTICE: using configuration file "/etc/repmgr/9.5/repmgr.conf"
NOTICE: switching current node 2 to master server and demoting current master to standby...
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
INFO: retrieving node list for cluster 'repmgr_cluster'
DEBUG: get_master_connection():
  SELECT id, conninfo,          CASE WHEN type = 'master' THEN 1 ELSE 2 END AS type_priority    FROM "repmgr_repmgr_cluster".repl_nodes    WHERE cluster = 'repmgr_cluster'      AND type != 'witness' ORDER BY active DESC, type_priority, priority, id
INFO: checking role of cluster node '1'
DEBUG: connecting to: 'host=10.70.3.251 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: get_master_connection(): current master node is 1
DEBUG: get_node_record():
SELECT id, type, upstream_node_id, name, conninfo,        slot_name, priority, active  FROM "repmgr_repmgr_cluster".repl_nodes  WHERE cluster = 'repmgr_cluster'    AND id = 1
DEBUG: remote node name is "itupl-postgen1"
DEBUG: test_ssh_connection(): executing ssh -o Batchmode=yes  10.70.3.251 /bin/true 2>/dev/null
DEBUG: get_pg_setting(): SELECT name, setting   FROM pg_catalog.pg_settings WHERE name = 'data_directory'
DEBUG: get_pg_setting(): returned value is "/var/lib/pgsql/9.5/data"
DEBUG: master's data directory is: /var/lib/pgsql/9.5/data
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 ls '/var/lib/pgsql/9.5/data/PG_VERSION' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was:
1
DEBUG: PG_VERSION found in /var/lib/pgsql/9.5/data
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 ls '/usr/pgsql-9.5/bin/pg_rewind' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was:
1
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings  WHERE name = 'full_page_writes' AND setting = 'off'
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings  WHERE name = 'wal_log_hints' AND setting = 'on'
INFO: looking for file "/etc/repmgr/9.5/repmgr.conf" on remote server "10.70.3.251"
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 ls '/etc/repmgr/9.5/repmgr.conf' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was:
1
INFO: remote configuration file "/etc/repmgr/9.5/repmgr.conf" found on remote server
DEBUG: remote_archive_config_dir: /tmp/repmgr-itupl-postgen1-archive
DEBUG: Executing:
/usr/pgsql-9.5/bin/repmgr standby archive-config -f '/etc/repmgr/9.5/repmgr.conf' --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 /usr/pgsql-9.5/bin/repmgr standby archive-config -f '/etc/repmgr/9.5/repmgr.conf' --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'

WARNING:  nonstandard use of escape in a string literal
LINE 1: ...config_file,          regexp_replace(config_file, '^.*\/',''...
                                                             ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
NOTICE: 3 files copied to /tmp/repmgr-itupl-postgen1-archive
DEBUG: remote_command(): output returned was:
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 /usr/pgsql-9.5/bin/pg_ctl -D '/var/lib/pgsql/9.5/data' -m fast -W stop >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was:
1
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 ls '/var/lib/pgsql/9.5/data/postmaster.pid' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was:
0
NOTICE: current master has been stopped
INFO: connecting to standby database
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
INFO: connected to standby, checking its state
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
INFO: retrieving node list for cluster 'repmgr_cluster'
DEBUG: get_master_connection():
  SELECT id, conninfo,          CASE WHEN type = 'master' THEN 1 ELSE 2 END AS type_priority    FROM "repmgr_repmgr_cluster".repl_nodes    WHERE cluster = 'repmgr_cluster'      AND type != 'witness' ORDER BY active DESC, type_priority, priority, id
INFO: checking role of cluster node '1'
DEBUG: connecting to: 'host=10.70.3.251 dbname=repmgr user=repmgr fallback_application_name='repmgr''
ERROR: connection to database failed: could not connect to server: Connection refused
        Is the server running on host "10.70.3.251" and accepting
        TCP/IP connections on port 5432?

INFO: checking role of cluster node '2'
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
NOTICE: promoting standby
DEBUG: get_pg_setting(): SELECT name, setting   FROM pg_catalog.pg_settings WHERE name = 'data_directory'
DEBUG: get_pg_setting(): returned value is "/var/lib/pgsql/9.5/data"
NOTICE: promoting server using '/usr/pgsql-9.5/bin/pg_ctl -D /var/lib/pgsql/9.5/data promote'
server promoting
INFO: reconnecting to promoted server
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: setting node 2 as master and marking existing master as failed
DEBUG: begin_transaction()
DEBUG: commit_transaction()
NOTICE: STANDBY PROMOTE successful
DEBUG: create_event_record():
INSERT INTO "repmgr_repmgr_cluster".repl_events (              node_id,              event,              successful,              details             )       VALUES ($1, $2, $3, $4)    RETURNING event_timestamp
DEBUG: create_event_record(): Event timestamp is "2017-05-22 16:56:06.860066+09:30"
NOTICE: Executing pg_rewind on old master server
DEBUG: pg_rewind command is:
'/usr/pgsql-9.5/bin/pg_rewind' -D '/var/lib/pgsql/9.5/data' --source-server=\'host=10.70.3.252 dbname=repmgr user=repmgr\'
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 '/usr/pgsql-9.5/bin/pg_rewind' -D '/var/lib/pgsql/9.5/data' --source-server=\'host=10.70.3.252 dbname=repmgr user=repmgr\'

DEBUG: remote_command(): output returned was:
servers diverged at WAL position 1/1D000098 on timeline 11
no rewind required
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 /usr/pgsql-9.5/bin/repmgr standby restore-config -D '/var/lib/pgsql/9.5/data'  --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'

ERROR: unable to determine cluster name - please provide a valid configuration file with -c/--config-file
HINT: Use -F/--force to continue anyway
DEBUG: remote_command(): output returned was:
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 test -e '/var/lib/pgsql/9.5/data/recovery.done' && rm -f '/var/lib/pgsql/9.5/data/recovery.done'

DEBUG: remote_command(): output returned was:
DEBUG: Executing:
/usr/pgsql-9.5/bin/repmgr -D '/var/lib/pgsql/9.5/data' -f '/etc/repmgr/9.5/repmgr.conf' -h 10.70.3.252 -d repmgr -U repmgr  standby follow
DEBUG: remote_command(): ssh -o Batchmode=yes  10.70.3.251 /usr/pgsql-9.5/bin/repmgr -D '/var/lib/pgsql/9.5/data' -f '/etc/repmgr/9.5/repmgr.conf' -h 10.70.3.252 -d repmgr -U repmgr  standby follow

NOTICE: restarting server using '/usr/pgsql-9.5/bin/pg_ctl  -w -D /var/lib/pgsql/9.5/data -m fast restart'
pg_ctl: PID file "/var/lib/pgsql/9.5/data/postmaster.pid" does not exist
Is server running?
starting server anyway


Regards
Dylan
