[BUGS] WAL Receiver Segmentation Fault

2012-12-28 Thread Phil Sorber
Postgres 9.0.11 running as a hot standby.

The master was restarted and the standby went into a segmentation
fault loop. A hard stop/start fixed it. Here are pertinent logs with
excess and identifying information removed:

2012-12-28 03:39:14 UTC  [16850]: [2-1] FATAL:  replication terminated
by primary server
zcat: /mnt/dbmount/walarchive/00031A0100D5.gz: No such
file or directory
2012-12-28 03:39:14 UTC  [16801]: [21-1] LOG:  record with zero length
at 1A01/D578
zcat: /mnt/dbmount/walarchive/00031A0100D5.gz: No such
file or directory
2012-12-28 03:39:14 UTC  [16798]: [2-1] LOG:  WAL receiver process
(PID 16671) was terminated by signal 11: Segmentation fault
2012-12-28 03:39:14 UTC  [16798]: [3-1] LOG:  terminating any other
active server processes
2012-12-28 03:39:15 UTC  [16798]: [4-1] LOG:  all server processes
terminated; reinitializing
2012-12-28 03:39:15 UTC  [16673]: [1-1] LOG:  database system was
interrupted while in recovery at log time 2012-12-28 03:35:47 UTC
2012-12-28 03:39:15 UTC  [16673]: [2-1] HINT:  If this has occurred
more than once some data might be corrupted and you might need to
choose an earlier recovery target.
zcat: /mnt/dbmount/walarchive/0004.history.gz: No such file or directory
zcat: /mnt/dbmount/walarchive/0003.history.gz: No such file or directory
2012-12-28 03:39:16 UTC  [16673]: [3-1] LOG:  entering standby mode
zcat: /mnt/dbmount/walarchive/00031A010092.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A01007D.gz: No such
file or directory
2012-12-28 03:39:16 UTC  [16673]: [4-1] LOG:  redo starts at 1A01/7D00C500
zcat: /mnt/dbmount/walarchive/00031A01007E.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A01007F.gz: No such
file or directory
...
zcat: /mnt/dbmount/walarchive/00031A0100C0.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100C1.gz: No such
file or directory
2012-12-28 03:39:24 UTC  [16681]: [1-1] LOG:  restartpoint starting: xlog
zcat: /mnt/dbmount/walarchive/00031A0100C2.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100C3.gz: No such
file or directory
...
zcat: /mnt/dbmount/walarchive/00031A0100D3.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100D4.gz: No such
file or directory
2012-12-28 03:39:28 UTC  [16673]: [5-1] LOG:  consistent recovery
state reached at 1A01/D430F1A0
2012-12-28 03:39:28 UTC  [16798]: [5-1] LOG:  database system is ready
to accept read only connections
zcat: /mnt/dbmount/walarchive/00031A0100D5.gz: No such
file or directory
2012-12-28 03:39:28 UTC  [16673]: [6-1] LOG:  record with zero length
at 1A01/D578
zcat: /mnt/dbmount/walarchive/00031A0100D5.gz: No such
file or directory
2012-12-28 03:39:28 UTC  [16798]: [6-1] LOG:  WAL receiver process
(PID 16870) was terminated by signal 11: Segmentation fault
2012-12-28 03:39:28 UTC  [16798]: [7-1] LOG:  terminating any other
active server processes
2012-12-28 03:39:28 UTC  [16798]: [8-1] LOG:  all server processes
terminated; reinitializing
2012-12-28 03:39:30 UTC  [16871]: [1-1] LOG:  database system was
interrupted while in recovery at log time 2012-12-28 03:35:47 UTC
2012-12-28 03:39:30 UTC  [16871]: [2-1] HINT:  If this has occurred
more than once some data might be corrupted and you might need to
choose an earlier recovery target.
zcat: /mnt/dbmount/walarchive/0004.history.gz: No such file or directory
zcat: /mnt/dbmount/walarchive/0003.history.gz: No such file or directory
2012-12-28 03:39:30 UTC  [16871]: [3-1] LOG:  entering standby mode
zcat: /mnt/dbmount/walarchive/00031A010092.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A01007D.gz: No such
file or directory
2012-12-28 03:39:30 UTC  [16871]: [4-1] LOG:  redo starts at 1A01/7D00C500
zcat: /mnt/dbmount/walarchive/00031A01007E.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A01007F.gz: No such
file or directory
...
zcat: /mnt/dbmount/walarchive/00031A0100C0.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100C1.gz: No such
file or directory
2012-12-28 03:39:38 UTC  [16883]: [1-1] LOG:  restartpoint starting: xlog
zcat: /mnt/dbmount/walarchive/00031A0100C2.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100C3.gz: No such
file or directory
...
zcat: /mnt/dbmount/walarchive/00031A0100D3.gz: No such
file or directory
zcat: /mnt/dbmount/walarchive/00031A0100D4.gz: No such
file or directory
2012-12-28 03:39:41 UTC  [16871]: [5-1] LOG:  consistent recovery
state reached at 1A01/D430F1A0
2012-12-28 03:39:41 UTC  [16798]: [9-1] LOG:  database system is ready
to accept read only connections
zcat: /mnt/dbmount/walarchive/00031A0100D5.gz: No such
file or directory
...
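
To confirm a crash loop like the one in the log above without reading every line, the postmaster's crash reports can be pulled out with a short script. This is only a sketch: the function name `wal_receiver_crashes` and the `SAMPLE` constant are made up for illustration, and the pattern assumes the message wording shown in this log (9.0.11), tolerating the line wrap the mail client introduced.

```python
import re

# Matches the postmaster's crash report even when the line was wrapped
# between "process" and "(PID" (the \s+ also matches a newline).
CRASH_RE = re.compile(
    r"WAL receiver process\s+\(PID (\d+)\) was terminated by signal (\d+)"
)


def wal_receiver_crashes(log_text):
    """Return a list of (pid, signal) tuples in the order they were logged."""
    return [(int(pid), int(sig)) for pid, sig in CRASH_RE.findall(log_text)]


# Two entries copied from the log above.
SAMPLE = """\
2012-12-28 03:39:14 UTC  [16798]: [2-1] LOG:  WAL receiver process
(PID 16671) was terminated by signal 11: Segmentation fault
2012-12-28 03:39:28 UTC  [16798]: [6-1] LOG:  WAL receiver process
(PID 16870) was terminated by signal 11: Segmentation fault
"""

crashes = wal_receiver_crashes(SAMPLE)
print(crashes)  # [(16671, 11), (16870, 11)]
```

Different PIDs reported by the same postmaster (16798 here) within a few seconds indicate that each freshly started WAL receiver is dying, not that one crash is being logged twice.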

Re: [BUGS] WAL Receiver Segmentation Fault

2012-12-28 Thread Heikki Linnakangas

On 28.12.2012 20:55, Phil Sorber wrote:

> Postgres 9.0.11 running as a hot standby.
>
> The master was restarted and the standby went into a segmentation
> fault loop. A hard stop/start fixed it. Here are pertinent logs with
> excess and identifying information removed:
> ...
> If there is any more info I can provide, let me know. This is a
> production DB so I won't be able to do any disruptive testing. Based
> on what I have seen so far, I think this would be difficult to
> replicate anyway.


A stack trace would be nice. If you didn't get a core dump this time, it 
would be good to configure the system so that you get one next time it 
happens.
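
One way to act on this is to lift the soft core-file limit in whatever process launches the server, since child processes inherit it. The following is a minimal sketch using Python's standard `resource` module on a Unix-like system. The helper name `raise_core_limit` is hypothetical, and how the server is actually started on this machine is not known from the thread. `pg_ctl start -c` is documented to attempt the same thing for the server it launches.

```python
import resource


def raise_core_limit():
    """Raise this process's soft core-file size limit to its hard limit.

    Processes started afterwards inherit the new limit.
    Returns the (soft, hard) pair now in effect.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = raise_core_limit()
    if soft == 0:
        print("core dumps still disabled: the hard limit is 0")
    elif soft == resource.RLIM_INFINITY:
        print("core dumps enabled with no size limit")
    else:
        print("core dumps enabled up to", soft, "bytes")
```

Where the core file ends up depends on the operating system's configuration (on Linux, the `kernel.core_pattern` setting), so that is worth checking before the next crash. Once a core exists, opening it in `gdb` together with the `postgres` binary and running `bt` gives the stack trace asked for here.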


- Heikki


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] WAL Receiver Segmentation Fault

2012-12-28 Thread Phil Sorber
On Fri, Dec 28, 2012 at 5:30 PM, Heikki Linnakangas wrote:
> A stack trace would be nice. If you didn't get a core dump this time, it
> would be good to configure the system so that you get one next time it
> happens.
>
> - Heikki

Sorry, no core. I will get it set up in case it happens again.

