Once the node that was down is brought back up, the replication slot does
not become active again. The reason appears to be that replication is trying
to create a partition table that already exists. Because of this error, the
replication slot is stuck in the inactive state. Is there any way to ignore
this error?
On 28-May-2016 4:56 PM, "Martín Marqués" <mar...@2ndquadrant.com> wrote:

> On 27/05/16 at 06:33, Nikhil wrote:
> > Hello,
> >
> >
> > I have a BDR setup with two nodes. If I bring one node down, I am seeing
> > that the replication slot becomes inactive with the error below.
>
> If you take down one of the nodes of a BDR mesh, the replication slots
> from each of the upstream nodes it connects to will switch to inactive.
> That's how replication slots work.
>
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL:  streaming transactions committing after 0/111A9148, reading WAL from 0/110F03F8
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG:  logical decoding found consistent point at 0/110F03F8
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL:  Logical decoding will begin using saved snapshot.
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG:  unexpected EOF on standby connection
>
> Downstream node got disconnected, which is sensible given that you took
> that node down.
>
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.437 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.462 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.096 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.101 ms
> > <3462016-05-25 23:58:20 GMT%LOG:  starting background worker process "bdr (6288505144157102317,1,16384,)->bdr (6288512113617339435,2,"
>
> It seems you brought up postgres on the downstream node again and it
> connected to the replication slot.
>
> > <798462016-05-25 23:58:20 GMT%ERROR:  relation "af_npx_device_l3_16_149_10" already exists
>
> I'm not sure what happened here. Does that relation exist?
>
> Run \d+ af_npx_device_l3_16_149_10 with psql on both nodes.
>
> Also, did replication resume? Check with the lag query from the BDR
> documentation.
>
> Regards,
>
> --
> Martín Marqués                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
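
For the two checks suggested above, here is a minimal sketch in SQL, to be
run in psql on each node. It assumes a 9.4-based BDR install, where the WAL
functions still carry the pre-10 pg_xlog_* names. The column aliases
(existing_relation, retained_bytes) are only illustrative, and this is not
necessarily the exact lag query from the BDR documentation.

```sql
-- Does the relation from the ERROR line exist on this node?
-- to_regclass() returns NULL instead of raising an error when the name
-- cannot be resolved on the current search_path.
SELECT to_regclass('af_npx_device_l3_16_149_10') AS existing_relation;

-- Is each logical slot active, and how much WAL is it holding back?
SELECT slot_name,
       database,
       active,
       pg_xlog_location_diff(pg_current_xlog_insert_location(),
                             restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

If active stays false and retained_bytes keeps growing on the upstream node,
the downstream apply worker is most likely still failing on the same error.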
