On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>On 12/11/2014 05:45 AM, Andres Freund wrote:
>> A customer recently reported getting "backup_label contains data
>> inconsistent with control file" after taking a basebackup from a
>> standby and starting it with a typo in primary_conninfo.
>>
>> When starting postgres from a basebackup StartupXLOG() has the
>> following code to deal with backup labels:
>>     if (haveBackupLabel)
>>     {
>>         ControlFile->backupStartPoint = checkPoint.redo;
>>         ControlFile->backupEndRequired = backupEndRequired;
>>
>>         if (backupFromStandby)
>>         {
>>             if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
>>                 ereport(FATAL,
>>                         (errmsg("backup_label contains data inconsistent with control file"),
>>                          errhint("This means that the backup is corrupted and you will "
>>                                  "have to use another backup for recovery.")));
>>             ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
>>         }
>>     }
>>
>> While I'm not enthusiastic about the error message, that bit of code
>> looks sane at first glance. We certainly expect the control file to
>> indicate we're in recovery. Since we're unlinking the backup label
>> shortly afterwards we'd normally not expect to hit that case after a
>> shutdown in recovery.
>
>Check.
>
>> The problem is that after reading the backup label we also have to
>> read the corresponding checkpoint from pg_xlog. If primary_conninfo
>> and/or restore_command are misconfigured and can't restore files, that
>> can only be fixed by shutting down the cluster and fixing up
>> recovery.conf - which sets DB_SHUTDOWNED_IN_RECOVERY in the control
>> file.
>
>No it doesn't. The state is set to DB_SHUTDOWNED_IN_RECOVERY in
>CreateRestartPoint(). If you shut down the server before it has even
>read the initial checkpoint record, it will not attempt to create a
>restartpoint nor update the control file.

Yes, it does. There's a shortcut that just sets the state in the control
file and then exits.

>> The easiest solution seems to be to simply also allow that as a state
>> in the above check. It might be nicer to not allow a ShutdownXLOG to
>> modify the control file et al at that stage, but I think that'd end up
>> being more invasive.
>>
>> A short search shows that that also looks like a credible explanation
>> for #12128...
>
>Yeah. I was not able to reproduce this, but I'm clearly missing
>something, since both you and Sergey have seen this happening. Can you
>write a script to reproduce?

Not right now, I only have my mobile... It's quite easy though. Create a
pg_basebackup from a standby. Create a recovery.conf with a broken
primary_conninfo. Start. Shutdown. Fix conninfo. Start.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
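
For illustration only, the "easiest solution" mentioned above (also
accepting the shut-down-in-recovery state in the backup_label check) might
look roughly like the standalone sketch below. It is not the actual patch:
the DBState enum is a simplified stand-in for the one in PostgreSQL's
control file header, and the helper names strict_check_ok() and
relaxed_check_ok() are hypothetical, invented here to contrast the two
conditions.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for PostgreSQL's DBState enum (assumption: only the
 * names matter for this sketch, not the numeric values).
 */
typedef enum
{
    DB_STARTUP,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper mirroring the condition quoted in the thread:
 * a backup taken from a standby is accepted only if the control file
 * says the cluster was in archive recovery.
 */
static bool
strict_check_ok(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY;
}

/*
 * Hypothetical helper for the proposed relaxation: additionally accept a
 * cluster that was shut down while still in recovery, e.g. after a first
 * start with a broken primary_conninfo.
 */
static bool
relaxed_check_ok(DBState dbstate_at_startup)
{
    return dbstate_at_startup == DB_IN_ARCHIVE_RECOVERY ||
           dbstate_at_startup == DB_SHUTDOWNED_IN_RECOVERY;
}
```

With the reproduction steps above, the second start would see
DB_SHUTDOWNED_IN_RECOVERY: the strict condition rejects it (which is where
the FATAL error comes from), while the relaxed one lets startup continue.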