Hi Andres,
Thank you very much for your response.

Full details on our PostgreSQL version are:
Name : postgresql-server
Arch : x86_64
Version : 9.2.13
Release : 1.el7_1
Size : 3.8 M
Repo : rhel-7-server-dvd-rpms
Summary : The programs needed to create and run a PostgreSQL server
URL : <http://www.postgresql.org/>
License : PostgreSQL
Description : PostgreSQL is an advanced Object-Relational database management 
system (DBMS).
: The postgresql-server package contains the programs needed to create
: and run a PostgreSQL server, which will in turn allow you to create
: and maintain PostgreSQL databases.

We generally promote a slave to master as a reaction to the master being 
determined dead by our healthchecker. This is not something that happens often 
"in the wild", but we have a substantial number of automated and manually 
operated QA setups where these situations are provoked. To promote, we just 
touch the trigger file. Typically, our monitoring process will then determine 
that the new master is ready, but that there is no slave available to stream 
to. At this point, the monitoring process turns off sync replication (by 
setting synchronous_standby_names to 'none') so that the new master is 
workable. At some point later, the old master is recycled and joins as a 
slave, and the monitoring process then turns sync replication back on (by 
setting synchronous_standby_names to the IPs of the master and slave). A rough 
sketch of these steps follows.
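
To make that concrete, here is a minimal sketch of what the monitoring process 
effectively does around a promotion; the config file path, the use of sed and 
the pg_ctl invocation are illustrative assumptions, not our exact tooling:

# on the slave being promoted: touch the trigger_file from recovery.conf
touch /var/run/postgresql_trigger

# on the new master: relax sync replication until a slave has rejoined
# (PostgreSQL 9.2 has no ALTER SYSTEM, so we edit postgresql.conf and reload)
sudo -u postgres sed -i \
    "s/^synchronous_standby_names.*/synchronous_standby_names = 'none'/" \
    /var/lib/pgsql/data/postgresql.conf
sudo -u postgres pg_ctl reload -D /var/lib/pgsql/data

# later, once the old master has rejoined as a slave, the same mechanism sets
# synchronous_standby_names back to the IPs of the master and the slave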


In order to set up a slave (and to recycle an old master), we do the following 
(a consolidated sketch follows the list):

- stop postgresql

- copy a full base backup to a staging area, with the command:
sudo -u postgres pg_basebackup --no-password --host=$ip \
    --label=Basebackup-$hostIP --username=root \
    --pgdata=$stagingDir --xlog-method=stream

- wipe the PostgreSQL data dir, with the command:
find $pgData/* | grep -v server.crt | grep -v server.key \
    | grep -v PG_VERSION | xargs rm -rf

- move data from the staging dir to postgresql data dir

- remove a file called backup_label, though I am not certain that this file is 
in fact still there.

- remove the trigger file.

- generate the recovery.conf (the values are interpolated by our tooling), with 
this content:
standby_mode = 'on'
primary_conninfo = 'host=$masterIP port=5432 user=$user password=$password application_name=$hostIP sslmode=require'
recovery_target_timeline = 'latest'
trigger_file = '/var/run/postgresql_trigger'

- start postgresql

- wait for it to become "streaming" in pg_stat_replication
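
For reference, here is a consolidated sketch of the above; the variable values, 
paths and service invocation are illustrative assumptions rather than our exact 
tooling:

ip=10.0.0.1                        # current master (placeholder)
hostIP=10.0.0.2                    # this host (placeholder)
user=replicator; password=secret   # replication credentials (placeholders)
pgData=/var/lib/pgsql/data         # assumed data directory
stagingDir=/var/lib/pgsql/staging  # assumed staging area

sudo systemctl stop postgresql

sudo -u postgres pg_basebackup --no-password --host=$ip \
    --label=Basebackup-$hostIP --username=root \
    --pgdata=$stagingDir --xlog-method=stream

# wipe the data dir, keeping the certificates and PG_VERSION
find $pgData/* | grep -v server.crt | grep -v server.key \
    | grep -v PG_VERSION | xargs rm -rf

# move the base backup into place, remove backup_label and the trigger file
sudo -u postgres cp -a $stagingDir/. $pgData/
rm -f $pgData/backup_label /var/run/postgresql_trigger

sudo -u postgres tee $pgData/recovery.conf >/dev/null <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$ip port=5432 user=$user password=$password application_name=$hostIP sslmode=require'
recovery_target_timeline = 'latest'
trigger_file = '/var/run/postgresql_trigger'
EOF

sudo systemctl start postgresql

# poll the master until this standby shows up as streaming
# (assumes the postgres OS user is allowed to connect to the master)
until [ "$(sudo -u postgres psql -h $ip -tAc \
    "SELECT state FROM pg_stat_replication WHERE application_name = '$hostIP'")" = "streaming" ]
do
    sleep 5
done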

... it should be noted that while this is going on, there are application 
servers that are working on the master database (sometimes relatively hard).


The log messages look something like the following (on the new slave):
Sep 20 14:57:13 ... LOG: database system was interrupted; last known up at 
2016-09-20 14:57:11 IST
...
Sep 20 15:00:27 ... LOG: entering standby mode
Sep 20 15:00:27 ... LOG: database system was not properly shut down; automatic 
recovery in progress
Sep 20 15:00:27 ... LOG: redo starts at 12/143FA3A8
Sep 20 15:00:27 ... LOG: 12 KnownAssignedXids (num=12 tail=0 head=12) [0]=38572 
[1]=38573 [2]=38574 [3]=38575 [4]=38576 [5]=38577 [6]=38578 [7]=38579 [8]=38580 
[9]=38581 [10]=38582 [11]=38583
Sep 20 15:00:27 ... CONTEXT: xlog redo Standby/RUNNING_XACTS: nextXid 38585 
latestCompletedXid 38571 oldestRunningXid 38572; 14 xacts: 38573 38575 38579 
38578 38574 38581 38580 38576 38577 38572 38582 38584 38583 38583
Sep 20 15:00:27 ... FATAL: out-of-order XID insertion in KnownAssignedXids
Sep 20 15:00:27 ... CONTEXT: xlog redo Standby/RUNNING_XACTS: nextXid 38585 
latestCompletedXid 38571 oldestRunningXid 38572; 14 xacts: 38573 38575 38579 
38578 38574 38581 38580 38576 38577 38572 38582 38584 38583 38583
Sep 20 15:00:27 ... LOG: startup process (PID 8053) exited with exit code 1
Sep 20 15:00:27 ... LOG: terminating any other active server processes

As you can see, it spends a few minutes starting up, during which I assume it 
is streaming the changes from the master that occurred between the base backup 
and the startup. During this period it receives connection attempts and logs a 
FATAL message saying that it is starting up (so I assume it does not process 
requests, which is fine).

I have just found at least one place where the new master logs the following, 
shortly after the new slave begins starting up but before it completes startup:

Sep 20 14:58:13 ... ERROR: prepared transaction with identifier 
"131077_AAAAAAAAAAAAAP//fwAAAd3KJH9X370UA1vQpzEwLjE3OS4yMjYuNjk=_AAAAAAAAAAAAAP//fwAAAd3KJH9X370UA1vQrgAAAAEAAAAA"
 is busy
Sep 20 14:58:13 ... ROLLBACK PREPARED 
'131077_AAAAAAAAAAAAAP//fwAAAd3KJH9X370UA1vQpzEwLjE3OS4yMjYuNjk=_AAAAAAAAAAAAAP//fwAAAd3KJH9X370UA1vQrgAAAAEAAAAA'
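
If it helps, we can list the prepared transactions held on the new master 
around that time; a minimal example using the standard pg_prepared_xacts view 
(run on the new master as a superuser):

sudo -u postgres psql -c "SELECT gid, prepared, owner, database FROM pg_prepared_xacts ORDER BY prepared;"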


In order to recover from this situation, we basically wipe the system and 
start over from a backup. This is not *as* bad as it sounds, but it is pretty 
painful nonetheless. It should be noted that we have not spent a significant 
amount of time determining the best way to recover, since we have been focused 
on how to avoid the situation altogether.

I am not certain which xlog file it is having difficulties with.
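
If the failing record is at (or just after) the redo start location from the 
log above, 12/143FA3A8, I assume the corresponding segment name can be derived 
with pg_xlogfile_name() on the running master (the function cannot be called 
during recovery), e.g.:

sudo -u postgres psql -h $masterIP -tAc "SELECT pg_xlogfile_name('12/143FA3A8');"

That should be the file under pg_xlog to run pg_xlogdump against.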

Best regards,

Fredrik

On October 18, 2016 at 8:49:29 pm +02:00, Andres Freund <and...@anarazel.de> 
wrote:

> Hi,
> 
> On 2016-10-18 14:57:52 +0200, <fred...@huitfeldt.com> wrote:
> 
> > we are running many postgresql master/slave setups. The slaves are
> > initialised from a pg_basebackup from the master and are sync
> > streaming from the master. When we determine the master has failed,
> > the slave is promoted. Some time after that, the old master is again
> > initialised with a pg_basebackup and starts streaming from the new
> > master.
> > 
> Could you describe in a bit more detail how exactly you're setting up
> the standbys? E.g. the exact recovery.conf used, whether you remove any
> files during starting a standby. Also how exactly you're promoting
> standbys?
> 
> > Recently, we have gotten this error a fair amount of times: "out-of-order 
> > XID insertion in KnownAssignedXids" when postgresql attempts to start after 
> > being initialised with a pg_basebackup from the current master.
> >
> 
> 
> Which version are you encountering this on precisely?
> 
> 
> > Once the issue has occurred, a subsequent re-initialisation (with a 
> > completely new pg_basebackup) does not resolve the issue.
> > 
> 
> How have you recovered from this so far?
> 
> 
> > I have a setup in the failing state, so I can produce any kind of log 
> > mesages / details that would be helpful.
> > 
> 
> Could you use pg_xlogdump to dump the WAL file on which replay failed?
> And then attach the output in a compressed manner?
> 
> Greetings,
> 
> Andres Freund
>
