The instance is still running, I tried to collect more information from it:
all databases are working as expected, the only issue is that monitoring SQL commands (pg_stat_activity, pg_stat_replication) are not working as expected (do not reflect postgres processes list from command-line) on Master: - pg_stat_activity is empty as well (they can be seen just in ps f -fu postgres output: CTSYSTEM lines) - psql as postgres: select * from pg_stat_activity sees only its own session - psql as unprivileged user (CTSYSTEM): select * from pg_stat_activity is empty - replication works fine (created a table, that was created also on all replicas) - added lines to postgresql.conf + reload: Opening new lines to postgresql.conf + reload configuration: client_min_messages = debug5 log_min_messages = debug5 log_min_error_statement = debug5 - activity seen in pg_log, also replication activity (pgreplic user) is seen, still nothing in pg_stat_replication/pg_stat_activity killed one slave postgres instance, restarted it - "standby "l2abrnch" has now caught up with primary" - replication works fine - no entries on Master in pg_stat_replication - ps -ef shows the new wal-sender process on master and wal-receiver process streaming on this slave Version is: PostgreSQL 9.3.10 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit I suspect something happened within master server (pg_stat_activity+pg_stat_replication not working as described, do not reflect ps -ef list of postgres processes and running SQL client/replication information) What may be additionally useful information before restarting the master? Regards, Andrej 2016-05-25 23:22 GMT+02:00 Andrej Vanek <andrej.vanek...@gmail.com>: > Streaming replication set-up, > > one master, 3 slaves connecting to it. > I expected ps -ef gets all wal-sender processes and SAME information I'll > get via select * from pg_stat_replication. > Instead I observed: > - pg_stat_replication is empty > - 3 wal-sender processes up and running > - each slave has wal-receiver process running > - replication works (tried to create a table- it appears in all databases) > Question: > - why is pg_stat_replication empty? > > Andrej > ---------------details > [root@l2bmain ~]# tail /opt/pg_data/postgresql.conf > max_wal_senders = 5 > hot_standby = on > wal_keep_segments = 128 > archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f' > wal_receiver_status_interval = 2 > max_standby_streaming_delay = -1 > max_standby_archive_delay = -1 > restart_after_crash = off > hot_standby_feedback = on > wal_sender_timeout = 1min > [root@l2bmain ~]# ps f -fu postgres > UID PID PPID C STIME TTY STAT TIME CMD > postgres 10797 1 0 15:53 ? S 0:20 > /usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c > config_file=/opt/pg_data//postgresql.conf > postgres 10820 10797 0 15:53 ? Ss 0:00 \_ postgres: logger > process > postgres 10823 10797 0 15:53 ? Ss 0:00 \_ postgres: > checkpointer process > postgres 10824 10797 0 15:53 ? Ss 0:00 \_ postgres: writer > process > postgres 10825 10797 0 15:53 ? Ss 0:00 \_ postgres: wal > writer process > postgres 10826 10797 0 15:53 ? Ss 0:01 \_ postgres: > autovacuum launcher process > postgres 10827 10797 0 15:53 ? Ss 0:00 \_ postgres: archiver > process last was 0000000100000000000000A3.00000028.backup > postgres 10828 10797 0 15:53 ? Ss 0:03 \_ postgres: stats > collector process > postgres 11286 10797 0 15:54 ? Ss 0:08 \_ postgres: wal > sender process pgreplic 192.168.204.12(55231) streaming 0/A401BED8 > postgres 11287 10797 0 15:54 ? Ss 0:06 \_ postgres: wal > sender process pgreplic 192.168.204.11(42937) streaming 0/A401BED8 > postgres 19322 10797 0 15:58 ? Ss 0:08 \_ postgres: wal > sender process pgreplic 192.168.101.11(52379) streaming 0/A401BED8 > postgres 28704 10797 0 18:44 ? Ss 0:00 \_ postgres: CTSYSTEM > lidb 192.168.102.13(58245) idle > postgres 7256 10797 0 18:52 ? Ss 0:00 \_ postgres: CTSYSTEM > lidb 192.168.102.23(55190) idle > postgres 8667 10797 0 18:53 ? Ss 0:00 \_ postgres: CTSYSTEM > lidb 192.168.102.13(58287) idle > [root@l2bmain ~]# psql -U postgres -c "select * from pg_stat_replication;" > pid | usesysid | usename | application_name | client_addr | > client_hostname | client_port | backend_start | state | sent_location | > write_location | flush_location | r > eplay_location | sync_priority | sync_state > > -----+----------+---------+------------------+-------------+-----------------+-------------+---------------+-------+---------------+----------------+----------------+-- > ---------------+---------------+------------ > (0 rows) > > [root@l2bmain ~]# tail /opt/pg_data/pg_log/postgresql-Wed.log > 2016-05-25 15:53:56 CEST:@:[8603] LOG: database system is shut down > 2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was shut down in > recovery at 2016-05-25 15:53:56 CEST > 2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was not properly > shut down; automatic recovery in progress > 2016-05-25 15:53:58 CEST:@:[10821] LOG: consistent recovery state > reached at 0/A2000090 > 2016-05-25 15:53:58 CEST:@:[10821] LOG: record with zero length at > 0/A2000090 > 2016-05-25 15:53:58 CEST:@:[10821] LOG: redo is not required > 2016-05-25 15:53:58 CEST:@:[10821] LOG: MultiXact member wraparound > protections are now enabled > 2016-05-25 15:53:58 CEST:@:[10797] LOG: database system is ready to > accept connections > 2016-05-25 15:53:58 CEST:@:[10826] LOG: autovacuum launcher started > ` > [root@l2bmain ~]# ssh 192.168.101.11 > Last login: Wed May 25 22:48:18 2016 from 192.168.101.12 > [root@l2amain ~]# ps f -fu postgres > UID PID PPID C STIME TTY STAT TIME CMD > postgres 5730 1 0 15:58 ? S 0:04 > /usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c > config_file=/opt/pg_data//postgresql.conf > postgres 5754 5730 0 15:58 ? Ss 0:00 \_ postgres: logger > process > postgres 5755 5730 0 15:58 ? Ss 0:00 \_ postgres: startup > process recovering 0000000100000000000000A4 > postgres 5773 5730 0 15:58 ? Ss 0:12 \_ postgres: wal > receiver process streaming 0/A401C030 > postgres 5774 5730 0 15:58 ? Ss 0:00 \_ postgres: > checkpointer process > postgres 5775 5730 0 15:58 ? Ss 0:00 \_ postgres: writer > process > postgres 5776 5730 0 15:58 ? Ss 0:00 \_ postgres: stats > collector process > [root@l2amain ~]# psql -U postgres -c "select pg_is_in_recovery();" > pg_is_in_recovery > ------------------- > t > (1 row) > [root@l2bmain ~]# ssh 192.168.204.11 > Warning: Permanently added '192.168.204.11' (RSA) to the list of known > hosts. > Last login: Wed May 25 16:28:49 2016 from 192.168.200.254 > [root@l2abrnch ~]# ps f -fu postgres > UID PID PPID C STIME TTY STAT TIME CMD > postgres 8174 1 0 15:48 ? S 0:04 > /usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c > config_file=/opt/geo_stdby_data//postgresql.conf > postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger > process > postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup > process recovering 0000000100000000000000A4 > postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres: > checkpointer process > postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer > process > postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats > collector process > postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal > receiver process streaming 0/A401C0D0 > > [root@l2bmain ~]# ssh 192.168.204.11 > Warning: Permanently added '192.168.204.11' (RSA) to the list of known > hosts. > Last login: Wed May 25 16:28:49 2016 from 192.168.200.254 > [root@l2abrnch ~]# ps f -fu postgres > UID PID PPID C STIME TTY STAT TIME CMD > postgres 8174 1 0 15:48 ? S 0:04 > /usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c > config_file=/opt/geo_stdby_data//postgresql.conf > postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger > process > postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup > process recovering 0000000100000000000000A4 > postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres: > checkpointer process > postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer > process > postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats > collector process > postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal > receiver process streaming 0/A401C0D0 > [root@l2abrnch ~]# psql -U postgres -c "select pg_is_in_recovery();" > pg_is_in_recovery > ------------------- > t > (1 row) > > [root@l2abrnch ~]# logout > Connection to 192.168.204.11 closed. > [root@l2bmain ~]# ssh 192.168.204.12 > Warning: Permanently added '192.168.204.12' (RSA) to the list of known > hosts. > Last login: Wed May 25 14:58:03 2016 from 192.168.200.254 > [root@l2bbrnch ~]# ps f -fu postgres > UID PID PPID C STIME TTY STAT TIME CMD > postgres 22885 1 0 15:48 ? S 0:00 > /usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c > config_file=/opt/geo_stdby_data//postgresql.conf > postgres 22913 22885 0 15:48 ? Ss 0:00 \_ postgres: logger > process > postgres 22914 22885 0 15:48 ? Ss 0:00 \_ postgres: startup > process recovering 0000000100000000000000A4 > postgres 22917 22885 0 15:48 ? Ss 0:00 \_ postgres: > checkpointer process > postgres 22918 22885 0 15:48 ? Ss 0:00 \_ postgres: writer > process > postgres 22919 22885 0 15:48 ? Ss 0:00 \_ postgres: stats > collector process > postgres 26163 22885 0 15:54 ? Ss 0:13 \_ postgres: wal > receiver process streaming 0/A401C0D0 > >