We have an open source framework you can use to spin up Kafka clusters (any version, or even any build you want), along with Zookeeper, using CloudFormation on AWS: https://github.com/stealthly/minotaur
It is very handy: you basically specify your instance types, counts, versions of code, etc. and hit <enter>. See https://github.com/stealthly/minotaur/tree/master/labs/kafka, e.g.

./minotaur.py lab deploy kafka -e bdoss-dev -d testing -r us-east-1 -z us-east-1a -k http://example.com/kafka.tar.gz -n 3 -i m1.small

There is some setup for the bastion host (https://github.com/stealthly/minotaur/tree/master/infrastructure/aws/bastion) and supervisor (https://github.com/stealthly/minotaur/tree/master/supervisor), and after that it is really nice and easy.

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Wed, Jan 14, 2015 at 2:54 PM, Joseph Lawson <jlaw...@roomkey.com> wrote:

> We have a separate daemon process that assigns EIPs to servers when they
> start up in an autoscaling group, based on an autoscaling message. So
> for a cluster of 3 we have 3 EIPs. Then we inject the EIPs into the startup
> script for Kafka, which checks to see if it has one of the EIPs and assigns
> itself the index of that IP, so in the list:
> 10.0.0.1 10.0.0.2 10.0.0.3
>
> 1 is broker 0, 2 is broker 1 and 3 is broker 2. All this is injected via
> cloudformation, and then we have a mod value, so if we want to spin brokers
> in the same group we do mod 1,2 and get brokers mod * 3 + index to
> determine which is in the group. (The EIPs are different as it is a
> different cloudformation.)
>
> For redundancy make sure you run at least two that have full replicas of
> all other partitions. We run replication factor of 3 with three instances,
> so if any goes down the other two bring it back in sync once a fresh server
> spins in the autoscaling group.
>
> ________________________________________
> From: Dillian Murphey <crackshotm...@gmail.com>
> Sent: Wednesday, January 14, 2015 2:42 PM
> To: users@kafka.apache.org
> Subject: kafka cluster on aws
>
> I can't seem to find much information to help me (being green to kafka) on
> setting up a cluster on aws. Does anyone have any sources?
>
> The question I have off the bat is, what methods have already been explored
> to generate a unique broker id? If I spin up a new server, do I just need
> to maintain my own broker-id list somewhere so I don't re-use an already
> allocated broker id?
>
> Also, I read an article about a broker going down and requiring a new
> broker be spun up with the same id. Is this also something I need to
> implement?
>
> I want to set up a kafka auto-scaling group on AWS, so I can add brokers at
> will or based on load. It doesn't seem too complicated, or maybe I'm too
> green to see it, but I don't want to re-invent everything myself.
>
> I know Loggly uses AWS/Kafka, so I'm hunting for more details on that too.
>
> Thanks for any help
>
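
For reference, the broker id scheme described in the quoted message (index of the instance's EIP in the injected list, plus mod * 3 for additional groups) could be sketched roughly as below. This is only an illustration, not the actual startup script: the function name broker_id and its parameters are hypothetical, and using the length of the EIP list in place of the literal 3 is an assumption that generalizes the "mod * 3 + index" rule to other cluster sizes.

```python
from typing import Sequence


def broker_id(eips: Sequence[str], my_ip: str, mod: int = 0) -> int:
    """Derive a Kafka broker id from this host's position in the EIP list.

    eips  -- ordered list of Elastic IPs injected at startup (hypothetical input)
    my_ip -- the Elastic IP currently attached to this instance
    mod   -- group number; 0 for the first group, 1, 2, ... for further groups
    """
    if mod < 0:
        raise ValueError("mod must be non-negative")
    # list.index raises ValueError if this host holds none of the EIPs,
    # which is the safe outcome: do not start a broker with a guessed id.
    index = list(eips).index(my_ip)
    # Assumption: group size equals the number of EIPs (3 in the original mail).
    return mod * len(eips) + index


if __name__ == "__main__":
    eips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    print(broker_id(eips, "10.0.0.1"))         # 0
    print(broker_id(eips, "10.0.0.3"))         # 2
    print(broker_id(eips, "10.0.0.3", mod=1))  # 5
```

Because the id depends only on which EIP the instance holds, a replacement server that picks up the same EIP comes back with the same broker id, which is what lets the remaining replicas bring it back in sync.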