I have a single-slave setup with hot standby, meaning I want to run queries 
on the slave. I'm running some tests to evaluate pgbarman on Ubuntu 11.10. 
I'm using only the packaged PostgreSQL, version 
"PostgreSQL 9.1.5 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real 
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit". Both the master and the slave 
are running on the same host.

master/postgresql.conf:

port = 5432
archive_mode = on
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 256
archive_command = '/bin/cp --verbose %p /var/pgexchange/%f'

master/pg_hba.conf (as I said, testing config only):

host    replication     postgres        127.0.0.1/32            trust

slave/postgresql.conf:
port = 5433
hot_standby = on
hot_standby_feedback = on
max_standby_archive_delay = -1
max_standby_streaming_delay = -1

slave/pg_hba.conf -- all at default

/var/lib/postgresql/9.1/slave0/recovery.conf:

standby_mode = on
restore_command = '/bin/cp --verbose /var/pgexchange/%f %p' 
primary_conninfo = 'host=localhost port=5432 user=postgres password=supersecretpassword'


The slave's log says it's connected to the master, but I can't connect.

# psql -h localhost -p 5433 -U postgres -d mydb
psql: FATAL:  the database system is starting up
FATAL:  the database system is starting up

The slave's log, after a fresh pg_basebackup + restore for the slave, contains:

==> /var/log/postgresql/postgresql-9.1-slave0.log <==
2012-09-25 00:46:22 UTC LOG:  database system was interrupted; last known up at 2012-09-25 00:44:20 UTC
2012-09-25 00:46:22 UTC LOG:  creating missing WAL directory "pg_xlog/archive_status"
2012-09-25 00:46:22 UTC LOG:  entering standby mode
`/var/pgexchange/000000010000000000000016' -> `pg_xlog/RECOVERYXLOG'
2012-09-25 00:46:22 UTC LOG:  restored log file "000000010000000000000016" from archive
2012-09-25 00:46:23 UTC LOG:  redo starts at 0/16000020
2012-09-25 00:46:23 UTC LOG:  consistent recovery state reached at 0/17000000
/bin/cp: cannot stat `/var/pgexchange/000000010000000000000017': No such file or directory
2012-09-25 00:46:23 UTC LOG:  incomplete startup packet
2012-09-25 00:46:23 UTC LOG:  streaming replication successfully connected to primary
2012-09-25 00:46:23 UTC FATAL:  the database system is starting up
2012-09-25 00:46:24 UTC FATAL:  the database system is starting up
2012-09-25 00:46:24 UTC FATAL:  the database system is starting up


The "database system is starting up" messages come from the pg_ctlcluster 
script, which tries to connect to the database to check whether the server is 
up and available. According to the log above, a consistent recovery state is 
reached and the slave connects to the primary, yet the slave still refuses 
connections. While the slave connects, the master logs no messages.
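
To make the log reading above less error-prone, here is a small Python sketch that checks a standby log excerpt for the startup milestones. The first two strings are taken from the log above. The third, "database system is ready to accept read only connections", is the message I understand PostgreSQL logs once hot standby is open for queries; it does not appear in my log, and treating its absence as significant is my assumption. The names MILESTONES, standby_milestones and SAMPLE_LOG are made up for this example.

```python
# Milestone messages to look for in a standby's log.
# "consistent" and "streaming" appear in the log above; "read_only_ready"
# is assumed to be the message logged once hot standby accepts queries.
MILESTONES = {
    "consistent": "consistent recovery state reached",
    "streaming": "streaming replication successfully connected to primary",
    "read_only_ready": "database system is ready to accept read only connections",
}


def standby_milestones(log_text):
    """Return {milestone_name: bool} telling which messages occur in log_text."""
    return {name: (needle in log_text) for name, needle in MILESTONES.items()}


# Excerpt of the slave log quoted above, one message per line.
SAMPLE_LOG = """\
2012-09-25 00:46:22 UTC LOG:  entering standby mode
2012-09-25 00:46:23 UTC LOG:  redo starts at 0/16000020
2012-09-25 00:46:23 UTC LOG:  consistent recovery state reached at 0/17000000
2012-09-25 00:46:23 UTC LOG:  streaming replication successfully connected to primary
2012-09-25 00:46:23 UTC FATAL:  the database system is starting up
"""

if __name__ == "__main__":
    for name, seen in standby_milestones(SAMPLE_LOG).items():
        print(name, "seen" if seen else "MISSING")
```

Run against the excerpt, this reports the first two milestones as seen and the read-only one as missing, which matches what I observe: the slave is consistent and streaming but never opens for queries.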

On the master, pg_stat_replication looks fine:

# select * from pg_stat_replication ;
 procpid | usesysid | usename  | application_name | client_addr | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
---------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
   27920 |       10 | postgres | walreceiver      | 127.0.0.1   |                 |       52193 | 2012-09-25 00:46:23.100631+00 | streaming | 0/17000000    | 0/17000000     | 0/17000000     | 0/17000000      |             0 | async

state == streaming; sent == write == flush == replay, so the slave seems to be 
consistent.
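
The same check can be written down as a short Python sketch, in case someone wants to repeat it on their own pg_stat_replication row. The function names and the ROW dictionary are made up for this example; the values are copied from the output above. The integer produced by lsn_to_int is only meant for equality and ordering comparisons, since I'm not certain it maps to an exact byte offset on 9.1.

```python
def lsn_to_int(lsn):
    """Turn an xlog location such as '0/17000000' into a comparable integer.

    Both halves are hexadecimal. The result is used for equality and
    ordering only, not as an exact byte offset.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_caught_up(row):
    """True if the row is streaming and all four locations are identical."""
    columns = ("sent_location", "write_location",
               "flush_location", "replay_location")
    positions = {lsn_to_int(row[col]) for col in columns}
    return row["state"] == "streaming" and len(positions) == 1


# Values copied from the pg_stat_replication output above.
ROW = {
    "state": "streaming",
    "sent_location": "0/17000000",
    "write_location": "0/17000000",
    "flush_location": "0/17000000",
    "replay_location": "0/17000000",
}

if __name__ == "__main__":
    print(replication_caught_up(ROW))  # True
```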

What am I missing here?

Thanks!
François
