-----Original Message----- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Fabio Milano Sent: Thursday, July 24, 2014 12:40 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Standby Server Bus 7 error
-----Original Message----- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Wednesday, July 23, 2014 10:25 PM To: Fabio Milano Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Standby Server Bus 7 error On Thu, Jul 24, 2014 at 12:53 PM, Fabio Milano <fmil...@crdsgroup.com> wrote: > Any assistance in interpreting the logs is much appreciated. > Replication server crashes. Below is snippet from log. > 2014-07-22 23:36:23 EDT LOG: started streaming WAL from pr imary at > 12/B0000000 on timeline 1 > 2014-07-22 23:43:12 EDT FATAL: could not receive data from WAL > stream: server closed the connection unexpectedly > This probably means the server terminated a bnormally > before or while processing the request. The replication stream has finished. Did you root server crash in some way? > cp: cannot stat '/var/lib/postgresql/archive/00000001000000 > 12000000D0': No such file or directory > 2014-07-22 23:43:12 EDT LOG: unexpected pageaddr 12/8D0000 > 00 in log segment 0000000100000012000000D0, offset 0 This means that server expected new WAL, but found instead an old record, something that happens when a WAL file is recycled and is not yet overwritten with new data. Did your archives get corrupted in some way by a crash of master? > 2014-07-22 23:43:28 EDT LOG: restored log file "0000000100 > 000012000000D0" from archive > cp: cannot stat '/var/lib/postgresql/archive/00000001000000 > 12000000D1': No such file or directory > 2014-07-22 23:43:28 EDT LOG: unexpected pageaddr 12/A50000 > 00 in log segment 0000000100000012000000D1, offset 0 Same here. > Upon trying a /etc/init.d/postgresql restart > 2014-07-23 14:47:55 EDT LOG: restored log file "0000000 00001B0000001D" > from archive > 2014-07-23 14:47:55 EDT LOG: consistent recovery state ched at > 1B/1DFC64C0 > 2014-07-23 14:47:55 EDT LOG: database system is ready t ccept read only > connections > 2014-07-23 14:47:55 EDT LOG: restored log file "0000000 00001B0000001E" > from archive > 2014-07-23 14:47:56 EDT LOG: startup process (PID 730) terminated by > signal 7: Bus error This may find its root to a hardware problem... What is the version of 9.3 you are using btw? -- Michael Thank you Michael, No report of crashing on the Master. >> This means that server expected new WAL, but found instead an old record, >> something that happens when a WAL file is recycled and is not yet >> overwritten with new data. Could this be caused by something happening by the developers on the Master? The Archive command on the master is: 'rsync -az %p postgres@[ipaddress]:archive/%f' >> This may find its root to a hardware problem... The standby server is a VPS. Previously 9.0.13 replication was working . 9.0.13 was removed. OS upgraded to Ubuntu 14.04 and Postgres 9.3.4 installed Master new install OS Ubuntu 14.04 and new postgres 9.3.4 install Version is: psql (PostgreSQL) 9.3.4 Performed a manual resync yesterday however another crash this morning. Standby - /etc/init.d/postgresql stop Master - psql -c "select pg_start_backup('label', true)" Master - rsync -avc --exclude postmaster.pid --exclude postgresql.pid /var/lib/postgresql/9.3/main/* [ip address]:/var/lib/postgresql/9.3/main/ Master - psql -c "SELECT pg_stop_backup()" Standby - /etc/init.d/postgresql start Another crash. 2014-07-24 09:58:31 EDT LOG: restored log file "000000010000003300000096" from archive cp: cannot stat '/var/lib/postgresql/archive/000000010000003300000097': No such file or directory 2014-07-24 09:58:31 EDT LOG: unexpected pageaddr 33/5C000000 in log segment 000000010000003300000097, offset 0 2014-07-24 09:58:32 EDT LOG: started streaming WAL from primary at 33/97000000 on timeline 1 2014-07-24 10:03:50 EDT FATAL: could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. cp: cannot stat '/var/lib/postgresql/archive/0000000100000033000000B7': No such file or directory 2014-07-24 10:03:50 EDT LOG: unexpected pageaddr 33/6A000000 in log segment 0000000100000033000000B7, offset 0 2014-07-24 10:04:06 EDT LOG: restored log file "0000000100000033000000B7" from archive cp: cannot stat '/var/lib/postgresql/archive/0000000100000033000000B8': No such file or directory 2014-07-24 10:04:06 EDT LOG: unexpected pageaddr 33/76000000 in log segment 0000000100000033000000B8, offset 0 2014-07-24 10:04:06 EDT LOG: started streaming WAL from primary at 33/B8000000 on timeline 1 2014-07-24 10:06:24 EDT LOG: startup process (PID 18905) was terminated by signal 7: Bus error 2014-07-24 10:06:24 EDT LOG: terminating any other active server processes /etc/init.d/postgresql restart 2014-07-24 15:19:55 EDT LOG: restored log file "000000010000003400000009" from archive 2014-07-24 15:19:57 EDT LOG: startup process (PID 22458) was terminated by signal 7: Bus error 2014-07-24 15:19:57 EDT LOG: terminating any other active server processes -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general >> Confirmed with Developers: >> We haven't done any rollbacks, and haven't seen any crashes. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general