Once the node that was down is brought back up, the replication slot does not become active again. The reason is that the replication slot is trying to create a partition table that already exists. Because of this error, the replication slot is stuck in the inactive state. Is there any way to ignore this error?

On 28-May-2016 4:56 PM, "Martín Marqués" <mar...@2ndquadrant.com> wrote:
> On 27/05/16 at 06:33, Nikhil wrote:
> > Hello,
> >
> > I have a BDR setup with two nodes. If I bring one node down, I am seeing that
> > the replication slot is becoming inactive with the below error.
>
> If you take down one of the nodes of a BDR mesh, the replication slots
> from each of the upstream nodes it connects to will switch to inactive.
> That's how replication slots work.
>
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL: streaming transactions committing after 0/111A9148, reading WAL from 0/110F03F8
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG: logical decoding found consistent point at 0/110F03F8
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL: Logical decoding will begin using saved snapshot.
> > <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG: unexpected EOF on standby connection
>
> Downstream node got disconnected, which is sensible given that you took
> that node down.
>
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG: duration: 0.437 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG: duration: 0.462 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG: duration: 0.096 ms
> > <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG: duration: 0.101 ms
> > <3462016-05-25 23:58:20 GMT%LOG: starting background worker process "bdr (6288505144157102317,1,16384,)->bdr (6288512113617339435,2,"
>
> It seems you brought up postgres on the downstream node again and it
> connected to the replication slot.
>
> > <798462016-05-25 23:58:20 GMT%ERROR: relation "af_npx_device_l3_16_149_10" already exists
>
> I'm not sure what happened here. Does that relation exist?
>
> Run \d+ af_npx_device_l3_16_149_10 with psql on both nodes.
>
> Also, did replication resume? Check with the lag query from the BDR
> documentation.
>
> Regards,
>
> --
> Martín Marqués http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services