Re: Trying to handle db corruption 9.6

Mariel Cherkassky Mon, 20 May 2019 10:21:14 -0700

Hey Greg,
Basically my backup was made after the first pg_resetxlog so I was wrong.
However, the customer had a secondary machine that wasn't synced for a
month. I have all the walls since the moment the secondary went out of
sync. Once I started it I hoped that it will start recover the wals and
fill the gap. However I got an error in the secondary :
 2019-05-20 10:11:28 PDT  19021  LOG:  entering standby mode
2019-05-20 10:11:28 PDT  19021  LOG:  invalid primary checkpoint record
2019-05-20 10:11:28 PDT  19021  LOG:  invalid secondary checkpoint link in
control file
2019-05-20 10:11:28 PDT  19021  PANIC:  could not locate a valid checkpoint
record
2019-05-20 10:11:28 PDT  19018  LOG:  startup process (PID 19021) was
terminated by signal 6: Aborted
2019-05-20 10:11:28 PDT  19018  LOG:  aborting startup due to startup
process failure
2019-05-20 10:11:28 PDT  19018  LOG:  database system is shut down.
                    I checked my secondary archive dir and pg_xlog dir and
it seems that the restore command doesnt work. My restore_command:
restore_command = 'rsync -avzhe ssh [email protected]:/var/lib/pgsql/archive/%f
/var/lib/pgsql/archive/%f ; gunzip < /var/lib/pgsql/archive/%f > %p'
archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup
/var/lib/pgsql/archive %r'



On Mon, May 20, 2019, 7:20 PM Bimal <[email protected]> wrote:

> I had ran into same issue about year back, luckily I had standby to
> quickly promote.  But, I wish there was better a documentation on how to
> handle WAL log fill up and resetting them.
>
> On Monday, May 20, 2019, 9:08:19 AM PDT, Mariel Cherkassky <
> [email protected]> wrote:
>
>
> A backup was made after the corruption appeared but before I tried using
> the pg_resetxlog command. Basically I just want to start the database with
> the data that is available in the files(I'm ok with loosing data that was
> in the cache and wasnt written to disk).
> My question is how can I continue from here ?
> I also sent this mail to pgadmin mail list..
>
> ‫בתאריך יום ב׳, 20 במאי 2019 ב-18:59 מאת ‪Greg Clough‬‏ <‪
> [email protected]‬‏>:‬
>
> > Yes I understand that.. I'm trying to handle it after the backup that I
> have taken..
>
>
> IMHO the best option here is to keep safe a copy as you have already done
> and then restore from a backup, and replay whatever WAL you have.  The
> database you have is terminally corrupted, and should never be relied upon
> going forward.
>
>
>
> You can try to get it running, and then extract the data with pg_dump...
> but even then you will need to manually verify it’s OK because you have no
> idea which dirty blocks from memory have been written to disk and which
> have not.  Without the WAL you have no way of making it consistent, and if
> they have been destroyed then you’re out of luck.
>
>
>
> If you don’t have backups and archived WAL then fixing what you’ve got may
> be your only option, but you should only go down that route if you have
> to.  If you have to “repair”, then I’d recommend engaging a reputable
> PostgreSQL consultancy to help you.
>
>
>
> Regards,
>
> Greg.
>
> P.S.  This conversation should probably be moved to something like
> pgsql-admin
>
>
>
> ------------------------------
>
> This e-mail, including accompanying communications and attachments, is
> strictly confidential and only for the intended recipient. Any retention,
> use or disclosure not expressly authorised by IHSMarkit is prohibited. This
> email is subject to all waivers and other terms at the following link:
> https://ihsmarkit.com/Legal/EmailDisclaimer.html
>
> Please visit www.ihsmarkit.com/about/contact-us.html for contact
> information on our offices worldwide.
>
>

Re: Trying to handle db corruption 9.6

Reply via email to