2016-11-29 15:21 GMT+13:00 David Steele <da...@pgmasters.net>: > On 11/24/16 8:05 PM, Patrick B wrote: > > > hmm.. I really don't get it. > > > > > > > > If I get messages like: > > > > *cp: cannot stat '/walfiles/0000000200001AF8000000A5': No such file or > > director*y > > > > In my head, it's saying that it was unable to recover that file and, > > because of that, there is missing data. > > Even if the server is able to connect to tbe master via streaming > > replication, there might be missing data. There might be data that is > > into master but not into slave. > > > > Am I wrong? If so, I don't understand why are the wal_files for, then. > > This is normal as far as it goes. Postgres keeps checking for the WAL > file that it needs to become consistent and cp keeps throwing errors > when it can't find the file (yet). > > This simply demonstrates how unsuitable cp is as a recovery command. I > recommend you look at doing backup/archive with a professional tool such > as pgBackRest (http://www.pgbackrest.org) or Barman > (http://www.pgbarman.org/). > > > Thanks for all the replies guys.
Related to the "cp" - That's fine. I'm just doing this in a test environment. On my prod servers I use wal-e(aws s3) and bash script(locally) to do this work. I've done some tests, that only gave me more questions. I'll put everything I done here and hopefully you guys will be able to help me to understand it. 1. Set up a pg-test-01 machine on Ubuntu 16 Server 2. installed Postgres 9.2 + contribs + libs + etc 3. Configured postgres to be a streaming replication slave (postgresql.conf and recovery.conf) 4. Configured my masterdb01 server to ship wal_files to the pg-test-01 server 5. Made a pg_basebackup on my master DB server (DB is 1TB big) 6. Recovered the basebackup on the pg-test-01 machine 7. Postgres started to recover the wal_files and then streaming replication worked: LOG: streaming replication successfully connected to primary 8. Stopped postgres on the pg-test-01 server for 10 minutes 9. Started postgres on the pg-test-01 to see if the wal_files would be recovered LOG: restored log file "0000000200001B2F0000004F" from archive LOG: restored log file "0000000200001B2F00000050" from archive LOG: restored log file "0000000200001B2F00000051" from archive LOG: restored log file "0000000200001B2F00000052" from archive 10. Removed all the wal_files from pg-test-01 (No wal_files are being shipped to this server anymore) 11. Stopped postgres for 2 hours 12. Started postgres as streaming replication cp: cannot stat 'archive/0000000200002B36000000BB': No such file or > directory > cp: cannot stat 'archive/0000000200002B36000000BC': No such file or > directory > cp: cannot stat 'archive/0000000200002B36000000BD': No such file or > directory > LOG: streaming replication successfully connected to primary Postgres was stopped for 2h. How can postgres connect using streaming replication, if no wal_files were in there??? Hos is that even possible?? I don't understand! Thanks Patrick