I think I spotted the problem today, I am missing a role on node2. Is there a pointer to or can you provide list of steps to take for the manual cleanup mentioned the log file. I am assuming I just need to remove the relevant entries in the bdr tables just on node2 in my case. Is that correct?
> On May 15, 2015, at 8:07 AM, Dennis <denn...@visi.com> wrote: > > On node1: > > apimgtdb=# SELECT * FROM bdr.bdr_nodes > apimgtdb-# ; > node_sysid | node_timeline | node_dboid | node_status | node_name | > node_local_dsn > | node_init_from_dsn > ---------------------+---------------+------------+-------------+-----------+---------------------------------------------------------------------------- > --+-------------------- > 6145502501690488441 | 2 | 16424 | r | node1 | > port=5432 dbname=apimgtdb user=postgres password=XXXX host=10.0.0.101 > | > (1 row) > > apimgtdb=# SELECT * FROM bdr.bdr_connections > apimgtdb-# ; > conn_sysid | conn_timeline | conn_dboid | conn_origin_sysid | > conn_origin_timeline | conn_origin_dboid | conn_is_unidirectional | > > conn_dsn | > conn_apply_delay | conn_replication_sets > ---------------------+---------------+------------+-------------------+----------------------+-------------------+------------------------+-------------- > ----------------------------------------------------------------+------------------+------------------------ > 6145502501690488441 | 2 | 16424 | 0 | > 0 | 0 | f | port=5432 dbn > ame=apimgtdb user=postgres password=XXXXX host=10.0.0.101 | > | {test_rep_set} > (1 row) > > > < 2015-05-15 05:47:47.070 PDT >ERROR: No peer nodes or peer node count > unknown, cannot acquire DDL lock > < 2015-05-15 05:47:47.070 PDT >HINT: BDR is probably still starting up, wait > a while > < 2015-05-15 05:47:47.070 PDT >CONTEXT: SQL statement "SECURITY LABEL FOR > bdr ON TABLE idn_oauth2_access_token IS '{ "sets" : [“test_rep_set"] }'" > PL/pgSQL function bdr.table_set_replication_sets(regclass,text[]) > line 30 at EXECUTE statement > < 2015-05-15 05:47:47.070 PDT >STATEMENT: select > bdr.table_set_replication_sets(’table1', ARRAY[’test_rep_set']); > > > > On node2: > > apimgtdb=# SELECT * FROM bdr.bdr_nodes; > node_sysid | node_timeline | node_dboid | node_status | node_name | > node_local_dsn > | node_init_from_dsn > > ---------------------+---------------+------------+-------------+-----------+------------------------------------------------------------------------- > ----+------------------------------------------------------------------------------ > 6148366974419236867 | 1 | 16386 | i | node2 | > port=5432 dbname=apimgtdb user=postgres password=XXXX host=10.0.0.102 > | port=5432 dbname=apimgtdb user=postgres password=XXXX host=10.0.0.101 > (1 row) > > > apimgtdb=# SELECT * FROM bdr.bdr_connections; > conn_sysid | conn_timeline | conn_dboid | conn_origin_sysid | > conn_origin_timeline | conn_origin_dboid | conn_is_unidirectional | > conn_dsn | > conn_apply_delay | conn_replication_sets > ---------------------+---------------+------------+-------------------+----------------------+-------------------+------------------------+----------- > ------------------------------------------------------------------+------------------+------------------------ > 6148366974419236867 | 1 | 16386 | 0 | > 0 | 0 | f | port=5432 > dbname=apimgtdb user=postgres password=XXXX host=10.0.0.102 | > | {test_rep_set} > (1 row) > > The following lines repeat in the logs on node2: > > < 2015-05-15 09:04:25.874 EDT >LOG: worker process: bdr db: apimgtdb (PID > 16330) exited with exit code 1 > < 2015-05-15 09:04:30.880 EDT >LOG: starting background worker process "bdr > db: apimgtdb" > < 2015-05-15 09:04:30.915 EDT >ERROR: previous init failed, manual cleanup > is required > < 2015-05-15 09:04:30.915 EDT >DETAIL: Found bdr.bdr_nodes entry for bdr > (6148366974419236867,1,16386,) with state=i in remote bdr.bdr_nodes > < 2015-05-15 09:04:30.915 EDT >HINT: Remove all replication identifiers and > slots corresponding to this node from the init target node then drop and > recreate this database and try again > < 2015-05-15 09:04:30.916 EDT >LOG: worker process: bdr db: apimgtdb (PID > 16338) exited with exit code 1 > > >> On May 15, 2015, at 2:30 AM, Craig Ringer <cr...@2ndquadrant.com >> <mailto:cr...@2ndquadrant.com>> wrote: >> >> SELECT * FROM bdr.bdr_connections >