Despite starting up, the nodes now cannot connect to each other, so they're all listed as Disconnected in the UI. I see this in the logs:
2016-11-18 15:50:19,080 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47224
2016-11-18 15:50:19,081 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bf9 with negotiated timeout 4000 for client /172.31.33.34:47224
2016-11-18 15:50:19,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47228
2016-11-18 15:50:19,186 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47230
2016-11-18 15:50:19,187 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfa with negotiated timeout 4000 for client /172.31.33.34:47228
2016-11-18 15:50:19,187 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfb with negotiated timeout 4000 for client /172.31.33.34:47230
2016-11-18 15:50:19,292 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47234
2016-11-18 15:50:19,293 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfc with negotiated timeout 4000 for client /172.31.33.34:47234

However, I definitely did not open any ports like 47234 on my nifi VMs. Is there a certain set of ports that needs to be open between the servers? My understanding was that only 2181, 2888, and 3888 were necessary for ZooKeeper.

On Fri, Nov 18, 2016 at 3:41 PM, Joe Gresock <[email protected]> wrote:

> It appears that if you try to start up just one node in a cluster with
> multiple zk hosts specified in zookeeper.properties, you get this error
> spammed at an incredible rate in your logs.
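A note on the high-numbered ports in the log: 47224-47234 are not listening ports that anyone needs to open. They are the ephemeral *source* ports the operating system assigns to each outbound client connection into ZooKeeper's fixed client port 2181 (visible on the left side of each log line). A minimal sketch of that behavior with plain sockets, nothing NiFi-specific assumed:

```python
import socket

# A server listening on one fixed port (analogous to ZooKeeper's 2181).
# Each client connection to it gets a high-numbered ephemeral *source*
# port chosen by the OS -- those never need explicit firewall rules.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # 0 = let the OS pick a free port
server.listen(1)
server_port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", server_port))
ephemeral_port = client.getsockname()[1]  # e.g. the 47224 seen in the log

print(f"server listens on {server_port}, client source port is {ephemeral_port}")

client.close()
server.close()
```

So only the fixed ports have to be reachable between the servers: per the logs in this thread, that would be ZooKeeper's 2181/2888/3888 plus NiFi's own cluster ports (9001 and 8443 appear in the second message below). On a stateful firewall such as an EC2 security group, the ephemeral-port side of an established connection to 2181 is tracked automatically.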
> When I started up all 3 nodes at once, they didn't receive the error.
>
> On Fri, Nov 18, 2016 at 3:18 PM, Joe Gresock <[email protected]> wrote:
>
>> I'm upgrading a test 0.x nifi cluster to 1.x using the latest in master
>> as of today.
>>
>> I was able to successfully start the 3-node cluster once, but then I
>> restarted it and got the following error spammed in the nifi-app.log.
>>
>> I'm not sure where to start debugging this, and I'm puzzled why it would
>> work once and then start giving me errors on the second restart. Has
>> anyone run into this error?
>>
>> 2016-11-18 15:07:18,178 INFO [main] org.eclipse.jetty.server.Server Started @83426ms
>> 2016-11-18 15:07:18,883 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow...
>> 2016-11-18 15:07:18,889 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9001
>> 2016-11-18 15:07:19,117 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: ip-172-31-33-34.ec2.internal:8443
>> 2016-11-18 15:07:25,781 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
>> 2016-11-18 15:07:25,782 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
>> 2016-11-18 15:07:34,685 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
>> 2016-11-18 15:07:34,685 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
>> 2016-11-18 15:07:34,696 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
>> 2016-11-18 15:07:34,698 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@671a652a Connection State changed to SUSPENDED
>>
>> *2016-11-18 15:07:34,699 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss*
>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>> at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) [curator-framework-2.11.0.jar:na]
>> at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) [curator-framework-2.11.0.jar:na]
>> at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) [curator-framework-2.11.0.jar:na]
>> at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) [curator-framework-2.11.0.jar:na]
>> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) [curator-framework-2.11.0.jar:na]
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_111]
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_111]
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
>>
>> --
>> I know what it is to be in need, and I know what it is to have plenty. I
>> have learned the secret of being content in any and every situation,
>> whether well fed or hungry, whether living in plenty or in want. I can
>> do all this through him who gives me strength. *-Philippians 4:12-13*
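The symptom described in the first reply above (errors spammed when a single node starts alone, none when all three start together) is consistent with ZooKeeper's majority-quorum rule: an ensemble of n servers can only elect a leader and serve requests once a strict majority, floor(n/2) + 1, of them are up. One node of a 3-server embedded ensemble can therefore never form a working quorum by itself, and Curator keeps retrying until the others come up. A quick sketch of the arithmetic:

```python
def zk_quorum(ensemble_size: int) -> int:
    """Smallest strict majority of a ZooKeeper ensemble -- the number of
    servers that must be running before leader election can succeed."""
    return ensemble_size // 2 + 1

for n in (1, 3, 5):
    print(f"{n}-server ensemble: {zk_quorum(n)} server(s) must be up")
# For the 3-node cluster in this thread, one node alone (1 < 2) cannot
# form a quorum, which would explain the ConnectionLoss retries above
# until the other embedded ZooKeeper servers start.
```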
