Update - I have two hot replication slaves of this db, and both have the problem. I took one out of recovery and ran REINDEX TABLE session_session, which fixed the errors for this row. Now I'm going to run vacuum and see if there are other tables that complain. If so, I'm guessing I will need to see if there is a way to force vacuum to continue on error; worst case, I might have to write a table-by-table vacuum script. If anyone has a better suggestion for determining the extent of the damage, I'd appreciate it.
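In case it helps anyone else, here is a minimal sketch of the table-by-table fallback I had in mind: vacuum each table in its own psql invocation so that an error on one table (e.g. a corrupt TOAST chunk) is recorded and the loop moves on instead of aborting. The database name "mydb" and the helper names are my own placeholders, not anything from PostgreSQL itself.

```python
import subprocess

def psql_run(sql, dbname="mydb"):
    # Placeholder helper: run one SQL statement via psql; raises on failure.
    # Assumes psql is on PATH and can connect to dbname without a password.
    subprocess.run(["psql", "-d", dbname, "-c", sql], check=True)

def vacuum_tables(tables, run=psql_run):
    """Vacuum each table in a separate statement so one corrupt table
    is collected in the failure list instead of stopping the whole run."""
    failed = []
    for t in tables:
        try:
            run('VACUUM ANALYZE "%s";' % t)
        except Exception:
            failed.append(t)
    return failed
```

The table list could come from something like SELECT tablename FROM pg_tables WHERE schemaname = 'public'; the failure list then tells you which tables still need a REINDEX.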
On Thu, Nov 21, 2013 at 2:10 PM, Mike Broers <mbro...@gmail.com> wrote:

> Hello, we are running postgres 9.2.5 on RHEL6. Our production server
> crashed hard, and when it came back up our logs were flooded with:
>
> STATEMENT: SELECT "session_session"."session_key",
> "session_session"."session_data", "session_session"."expire_date",
> "session_session"."nonce" FROM "session_session" WHERE
> ("session_session"."session_key" = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3' AND
> "session_session"."expire_date" > '2013-11-21 13:27:33.107913' )
>
> ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127
> in pg_toast_19122
>
> We restarted the application and whatever session was constantly hitting
> that row stopped, but I'm concerned about remediation. When I attempt to
> read from that row, the error occurs:
>
> select * from session_session where session_key =
> 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
> ERROR: unexpected chunk number 0 (expected 1) for toast value 117927127
> in pg_toast_19122
>
> When I attempt to delete this row, I get this error:
>
> delete from session_session where session_key =
> 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
> ERROR: tuple concurrently updated
>
> We happen to have a maintenance window tonight, so I will have some time
> while the app is down to run some database fixes. I saw other threads
> suggesting a reindex of the toast table, but this is a 14GB table and I'm
> not sure how long that will take or whether it will even succeed. We also
> have a full db vacuum/analyze scheduled nightly for 2am, so I expect to
> learn whether there are other impacted tables, but it's troubling that I
> don't know what the remediation is. This particular table could be
> truncated if necessary, if that is an option, but I'm not sure about
> other tables.
>
> Any suggestions for how to handle the tuple concurrently updated error?
> Or is a reindex likely to help with the unexpected chunk error?
>
> Thanks
> Mike