[GENERAL] Recurring corrupted page pointer panics on 9.4.4 hot-standby replica

Michael Robinson Mon, 26 Oct 2015 04:07:09 -0700

Hi,

Two days ago, we started getting panics on a hot-standby replica as follows:


2015-10-24 14:16:46.489 UTC PANIC:  corrupted page pointers: lower = 17, upper = 0, special = 8176
2015-10-24 14:16:46.490 UTC CONTEXT:  xlog redo unlink_page: rel 1663/16416/254063; dead 11796080; left 1365037; right 3024097; btpo_xact 64542957; leaf 2456241; leafleft 11130443; leafright 1350594; topparent 4294967295
2015-10-26 04:51:40.530 UTC PANIC:  corrupted page pointers: lower = 17, upper = 0, special = 8176
2015-10-26 04:51:40.530 UTC CONTEXT:  xlog redo unlink_page: rel 1663/16416/254063; dead 9922828; left 2449142; right 3415026; btpo_xact 64982371; leaf 2290440; leafleft 5120238; leafright 1903321; topparent 4294967295
2015-10-26 10:24:02.613 UTC PANIC:  corrupted page pointers: lower = 17, upper = 0, special = 8176
2015-10-26 10:24:02.613 UTC CONTEXT:  xlog redo unlink_page: rel 1663/16416/401628; dead 2348571; left 2348281; right 2351431; btpo_xact 65010718; leaf 2348740; leafleft 2348434; leafright 2351568; topparent 4294967295
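
As a rough illustration of why these values get rejected: the message appears to come from a sanity check on the page header's lower/upper/special offsets. The sketch below is a Python restatement of that check as I understand it, not the server's actual code. It assumes a default build (8192-byte blocks, 24-byte page header); the function name is made up for illustration. Which code path raises the panic during redo is not something this sketch establishes.

```python
# Assumptions (not taken from the logs): default 8 kB block size and a
# 24-byte page header, as in a stock PostgreSQL build.
BLCKSZ = 8192
SIZE_OF_PAGE_HEADER = 24


def page_pointers_ok(lower: int, upper: int, special: int) -> bool:
    """Hypothetical helper: True if the three offsets are mutually consistent.

    A sane page satisfies header size <= lower <= upper <= special <= BLCKSZ.
    """
    return SIZE_OF_PAGE_HEADER <= lower <= upper <= special <= BLCKSZ


# Values from the PANIC lines above: lower is smaller than the header size
# and upper is below lower, so the check fails.
print(page_pointers_ok(17, 0, 8176))
```

With the logged values, `lower = 17` falls inside the page header and `upper = 0` is below `lower`, so both conditions are violated at once, while `special = 8176` on its own looks plausible for an index page.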


The replica runs on a dedicated EC2 instance and had been running without
any problems for several months.  The installed package is
9.4.4-1.pgdg14.04+1 from the apt repository, on Ubuntu 14.04 (Trusty).
The database is around 440GB and is under constant moderate read-only
load (100-1000 queries per second).

There have been no issues with the master database, nor have there been any
database shutdowns other than the panics.

I would be very grateful for any insights as to what may have caused this,
and how best to recover stable operation.

Best regards,
Michael Robinson
