There is no leader in Cassandra. I suggest you ask the Azkaban community about integration with Azkaban and about Azkaban HA.
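
For what it's worth, the ZooKeeper idea quoted below usually follows the standard election recipe: every candidate registers an ephemeral sequential entry, and whoever holds the lowest sequence number is the leader and runs the single-instance app. Here is a rough in-memory sketch of just that rule. It is not a ZooKeeper client; ElectionRegistry, should_run_scheduler and the node-N host names are hypothetical names I made up for illustration, and a real setup would need a ZooKeeper ensemble plus a client library.

```python
# In-memory sketch of the ZooKeeper-style election rule: each candidate gets a
# sequence number when it joins, and the lowest surviving number is the leader.
# ASSUMPTION: ElectionRegistry is a hypothetical stand-in, not a real client.
from typing import Dict, Optional


class ElectionRegistry:
    def __init__(self) -> None:
        self._next_seq = 0
        self._members: Dict[int, str] = {}  # sequence number -> host name

    def join(self, host: str) -> int:
        """Register a candidate and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._members[seq] = host
        return seq

    def leave(self, seq: int) -> None:
        """Drop a candidate, as would happen when its session ends."""
        self._members.pop(seq, None)

    def leader(self) -> Optional[str]:
        """Return the host holding the lowest sequence number, if any."""
        if not self._members:
            return None
        return self._members[min(self._members)]


def should_run_scheduler(registry: ElectionRegistry, host: str) -> bool:
    """Only the current leader starts the single-instance application."""
    return registry.leader() == host


registry = ElectionRegistry()
seats = {h: registry.join(h) for h in ("node-1", "node-2", "node-3")}
print(registry.leader())         # node-1
registry.leave(seats["node-1"])  # simulate the leader going away
print(registry.leader())         # node-2
```

Note that this only decides who should run the process. Starting it on the new leader and stopping it on the old one is still up to you.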

On Sunday, August 16, 2015, Vikram Kone <vikramk...@gmail.com> wrote:

> Can't we use ZooKeeper for leader election in Cassandra and, based on who
> is leader, run Azkaban or any app instance for that matter on that
> Cassandra server? I'm thinking that I can copy the application folder to
> all nodes and then determine which one to run using ZooKeeper. Is that
> possible?
>
> On Sun, Aug 16, 2015 at 6:47 AM -0700, "John Wong" <gokoproj...@gmail.com> wrote:
>
>> Hi
>>
>> I am not familiar with Azkaban, and this is probably a better question for
>> the Azkaban community IMO. But there seem to be two modes
>> (http://azkaban.github.io/azkaban/docs/2.5/): one is solo and one is
>> two-server mode, but either way I think it is still a SPOF? If there is no
>> election, just based on process, my 2 cents would be monitor, alert, and
>> start the process somewhere else. Better yet, don't install the process on
>> a Cassandra node. Keep your instance for one purpose only. If you run in a
>> cloud like AWS you will be able to autoscale min1 max1 easily.
>>
>> Note: In a peer-to-peer architecture, there is simply no concept of master.
>> You can start with some seed nodes for discovery. It depends on how you
>> design discovery.
>>
>> On Sat, Aug 15, 2015 at 11:49 AM, Vikram Kone <vikramk...@gmail.com> wrote:
>>
>>> Hi,
>>> We are planning to install Azkaban in solo server mode on a 24
>>> node Cassandra cluster to be able to schedule Spark jobs with an intricate
>>> dependency chain. The problem is that since Cassandra has a no-SPOF
>>> architecture, i.e. any node can become the master for the cluster, it
>>> creates a problem for the Azkaban master, since it's not a peer-to-peer
>>> architecture where any node can become the master. Only a single node has
>>> to be master at any given time.
>>>
>>> What are our options here? Are there any frameworks or tools out there
>>> that would allow any application to run on a cluster of machines with high
>>> availability?
>>> Should I be looking at something like ZooKeeper for this? Or Mesos maybe?
>>
>>
>> -- Sent from Jeff Dean's printf() mobile console