Hi Tim, You have to enable hot_standby=on parameter on the replica server. See the below link. https://wiki.postgresql.org/wiki/Hot_Standby
Regards, Ibrahim. 2018-01-30 21:16 GMT+03:00 Tim Bowden <tim.bow...@mapforge.com.au>: > I've inherited a PG 9.4.4 install on Win 2008 that I'm wanting to > stream from (abt 80Gb on disk). Everything seems to be working from > the "master" side. > > I have an AMI of the master host (a vmware instance in our soon to > close data centre) that has been spun up in AWS (replicant). In it's > "default" configuration (ie, exactly the same as master) it worked > fine. > > I ran pg_basebackup on the master (from memory- details on work pc): > pg_basebackup -D "e:\\mybasebackup\\" -F t -R -X f -z -c fast \ > -h <ip_addr> -U <rep_user> > > Base backup is created fine. I've copied it up to AWS and extracted it > into the correct location (after ensuring it is empty) on the replicant > windows host (tar -xzvf base.tgz from a linux box with the pg data > drive mounted). File metadata is changed during the extraction (ie, > file timestamps- could this possibly be an issue?) but permissions on > the win2008 replicant host have been set & double checked. > > When I try to start the pg server on replicant, the log streaming from > master works (and keeps working till the server is rebooted), but the > server can't start for some reason. > > Copy of pg log file on replicant: > > 2018-01-30 17:18:02 AWST LOG: database system was shut down in > recovery at 2018-01-30 17:16:59 AWST > 2018-01-30 17:18:02 AWST LOG: entering standby mode > 2018-01-30 17:18:02 AWST LOG: redo starts at C26/84018A30 > 2018-01-30 17:18:02 AWST LOG: consistent recovery state reached at > C26/8401C6D0 > 2018-01-30 17:18:02 AWST LOG: invalid record length at C26/8401C6D0 > 2018-01-30 17:18:03 AWST LOG: started streaming WAL from primary at > C26/84000000 on timeline 1 > 2018-01-30 17:18:03 AWST FATAL: the database system is starting up > 2018-01-30 17:18:04 AWST FATAL: the database system is starting up > 2018-01-30 17:18:05 AWST FATAL: the database system is starting up > 2018-01-30 17:18:06 AWST FATAL: the database system is starting up > 2018-01-30 17:18:07 AWST FATAL: the database system is starting up > 2018-01-30 17:18:08 AWST FATAL: the database system is starting up > > The last line is repeated for about a minute till it gives up. It > leaves behind a bunch of processes that keep streaming log files as > master creates new wal records. This keeps going till the replicant > box is rebooted (easiest way to clean up the processes and free up port > 5432). > > I can keep rebooting the replicant host and on retrying to start the pg > server, it will apply any new wal files that have been streamed, then > fail to start the server but keep streaming again. > > The Windows service start command for PG: > > "C:\Program Files (x86)\PostgreSQL\9.4.4\bin\pg_ctl.exe" runservice -N > "postgresql-9.4" -D "E:\PostgreSQL\9.4.4\data" -w > > Config files: > > Master postgresql.conf: > > dynamic_shared_memory_type = windows > wal_level = hot_standby # Was "archive", but we'd like to do > hot... > archive_mode = on > archive_command = 'copy "%p" > "e:\\PostgreSQL\\9.4.4\\wal_archive\\%f"' # Windows > max_wal_senders = 8 > wal_keep_segments = 900 # Sick of losing old segments while I fix this > max_replication_slots = 8 > log_line_prefix = '%t ' > log_timezone = 'Australia/Perth' > datestyle = 'iso, dmy' > timezone = 'Australia/Perth' > lc_messages = 'English_Australia.1252' > lc_monetary = 'English_Australia.1252' > lc_numeric = 'English_Australia.1252' > lc_time = 'English_Australia.1252' > default_text_search_config = 'pg_catalog.english' > listen_addresses = '*' > port = 5432 > max_connections = 300 > shared_buffers = 500MB > work_mem = 32MB > maintenance_work_mem = 128MB > checkpoint_segments = 100 > random_page_cost = 2.0 > effective_cache_size = 1500MB > log_destination = 'stderr' > logging_collector = on > log_min_duration_statement = 500 > log_line_prefix = '%t ' > > Replicant postgresql.conf (essentially a copy from master with minimal > changes): > > dynamic_shared_memory_type = windows > wal_level = archive > max_standby_streaming_delay = 30s > wal_receiver_status_interval = 30s > log_line_prefix = '%t ' > log_timezone = 'Australia/Perth' > datestyle = 'iso, dmy' > timezone = 'Australia/Perth' > lc_messages = 'English_Australia.1252' > lc_monetary = 'English_Australia.1252' > lc_numeric = 'English_Australia.1252' > lc_time = 'English_Australia.1252' > default_text_search_config = 'pg_catalog.english' > listen_addresses = '*' > port = 5432 > max_connections = 300 > shared_buffers = 500MB > work_mem = 32MB > maintenance_work_mem = 128MB > checkpoint_segments = 100 > random_page_cost = 2.0 > effective_cache_size = 1500MB > log_destination = 'stderr' > logging_collector = on > log_min_duration_statement = 500 > log_line_prefix = '%t ' > > replicant recovery.conf: > > standby_mode = 'on' > primary_conninfo = 'host=<master_ip> user=<rep_user> > password=<don't_tell> connect_timeout=10 application_name=replicant_3' > primary_slot_name = 'replicant_3' > recovery_min_apply_delay = 10 > #restore_command = 'copy > "e:\\PostgreSQL\\9.4.4\\basebackup\\wal_archive\\%f" "%p" ' > trigger_file = 'recovery_done.txt' > > The restore_command was uncommented at first and worked fine. All new > updates now are provided by streaming. > > Any ideas? I'm tearing my hair out with this. The boxes are about as > identical as you can get, apart from the fact that one is running on > VMware and the other in AWS. > > Thanks, > Tim Bowden > >