Hello Joe,

Just out of curiosity, how long did you let NiFi run while waiting for the nodes to connect?

On Fri, Nov 18, 2016 at 10:53 AM Joe Gresock <[email protected]> wrote:

> Despite starting up, the nodes now cannot connect to each other, so they're
> all listed as Disconnected in the UI. I see this in the logs:
>
> 2016-11-18 15:50:19,080 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47224
> 2016-11-18 15:50:19,081 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bf9 with negotiated timeout 4000 for client /172.31.33.34:47224
> 2016-11-18 15:50:19,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47228
> 2016-11-18 15:50:19,186 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47230
> 2016-11-18 15:50:19,187 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfa with negotiated timeout 4000 for client /172.31.33.34:47228
> 2016-11-18 15:50:19,187 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfb with negotiated timeout 4000 for client /172.31.33.34:47230
> 2016-11-18 15:50:19,292 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new session at /172.31.33.34:47234
> 2016-11-18 15:50:19,293 INFO [CommitProcessor:2] o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfc with negotiated timeout 4000 for client /172.31.33.34:47234
>
> However, I definitely did not open any ports similar to 47234 on my nifi
> VMs. Is there a certain set of ports that need to be open between the
> servers? My understanding was that only 2888, 3888, and 2181 were
> necessary for zookeeper.
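
On the port question: as far as I can tell, the high-numbered ports in those log lines (47224, 47228, ...) are the source ports of the connecting client, picked by the client's OS, while the :2181 in the thread name is the address ZooKeeper is listening on. If that reading is right, nothing in the 47xxx range needs to be opened inbound. A rough Python sketch that pulls the two apart; the regex and the parse_session_line helper are my own illustration, not anything NiFi or ZooKeeper ships:

```python
import re

# Matches the listening port in the thread name and the client's remote
# address at the end of a ZooKeeper "new session" log line.
# (Hypothetical helper pattern, written against the log format quoted above.)
SESSION_RE = re.compile(
    r"\[NIOServerCxn\.Factory:[^\]]*:(?P<listen_port>\d+)\]"
    r".*new session at /(?P<client_ip>[\d.]+):(?P<client_port>\d+)"
)


def parse_session_line(line):
    """Return (listen_port, client_ip, client_port), or None if no match."""
    m = SESSION_RE.search(line)
    if m is None:
        return None
    return int(m["listen_port"]), m["client_ip"], int(m["client_port"])


# One of the log lines from the message above, rejoined onto a single line.
sample = (
    "2016-11-18 15:50:19,080 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] "
    "o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new "
    "session at /172.31.33.34:47224"
)

if __name__ == "__main__":
    listen_port, client_ip, client_port = parse_session_line(sample)
    print(f"server listens on {listen_port}; "
          f"client {client_ip} connected from port {client_port}")
```

Running it over the whole log should show the same listening port on every line and a different client port per session, which is what you would expect from ordinary outbound connections.
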
>
> On Fri, Nov 18, 2016 at 3:41 PM, Joe Gresock <[email protected]> wrote:
>
> > It appears that if you try to start up just one node in a cluster with
> > multiple zk hosts specified in zookeeper.properties, you get this error
> > spammed at an incredible rate in your logs. When I started up all 3 nodes
> > at once, they didn't receive the error.
> >
> > On Fri, Nov 18, 2016 at 3:18 PM, Joe Gresock <[email protected]> wrote:
> >
> >> I'm upgrading a test 0.x nifi cluster to 1.x using the latest in master
> >> as of today.
> >>
> >> I was able to successfully start the 3-node cluster once, but then I
> >> restarted it and get the following error spammed in the nifi-app.log.
> >>
> >> I'm not sure where to start debugging this, and I'm puzzled why it would
> >> work once and then start giving me errors on the second restart. Has
> >> anyone run into this error?
> >>
> >> 2016-11-18 15:07:18,178 INFO [main] org.eclipse.jetty.server.Server Started @83426ms
> >> 2016-11-18 15:07:18,883 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow...
> >> 2016-11-18 15:07:18,889 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9001
> >> 2016-11-18 15:07:19,117 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: ip-172-31-33-34.ec2.internal:8443
> >> 2016-11-18 15:07:25,781 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
> >> 2016-11-18 15:07:25,782 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
> >> 2016-11-18 15:07:34,685 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
> >> 2016-11-18 15:07:34,685 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
> >> 2016-11-18 15:07:34,696 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
> >> 2016-11-18 15:07:34,698 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@671a652a Connection State changed to SUSPENDED
> >>
> >> *2016-11-18 15:07:34,699 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
> >> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss*
> >>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) [curator-framework-2.11.0.jar:na]
> >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) [curator-framework-2.11.0.jar:na]
> >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) [curator-framework-2.11.0.jar:na]
> >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) [curator-framework-2.11.0.jar:na]
> >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) [curator-framework-2.11.0.jar:na]
> >>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> >>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_111]
> >>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_111]
> >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
> >>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> >>
> >> --
> >> I know what it is to be in need, and I know what it is to have plenty. I
> >> have learned the secret of being content in any and every situation,
> >> whether well fed or hungry, whether living in plenty or in want. I can
> >> do all this through him who gives me strength. *-Philippians 4:12-13*
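
On the earlier observation in this thread that starting only one of the three nodes floods the log: a ZooKeeper ensemble only serves requests while a strict majority of its configured servers are up. With three hosts listed in zookeeper.properties, a lone node has no quorum, which would be consistent with the ConnectionLoss errors, though I have not confirmed that is the whole story here. The arithmetic, as a small Python sketch (the function names are mine, purely for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, servers_up):
    """True when enough servers are running for the ensemble to serve requests."""
    return servers_up >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # Walk through the three-host case discussed in this thread.
    for up in range(1, 4):
        print(f"3-host ensemble, {up} up: has quorum = {has_quorum(3, up)}")
```

So for a three-host ensemble, at least two of the embedded ZooKeeper servers would need to be running before any NiFi node can expect a stable connection.
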
