Hey Zuber,

Our AWS ZK deployment uses a dedicated subnet, fixed private IP addresses, and EBS volumes for ZK data. That way, if a ZK instance fails, it can be replaced with another instance that has the same IP and data volume, so the address the brokers have cached stays valid.
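To make the fixed-IP idea concrete, here is a minimal sketch of how the ZK ensemble and the brokers could both be pointed at the same static addresses. The IPs below are hypothetical placeholders, not our real ones, and the helper names are made up for illustration. The ports are the conventional ZooKeeper defaults (2888 peer, 3888 election, 2181 client).

```python
# Hypothetical fixed private IPs reserved in the dedicated ZK subnet,
# keyed by ZooKeeper server id (the value stored in each node's myid file).
ZK_FIXED_IPS = {1: "10.0.100.11", 2: "10.0.100.12", 3: "10.0.100.13"}


def zoo_cfg_server_lines(ips, peer_port=2888, election_port=3888):
    """Build the server.N lines for zoo.cfg from the fixed IPs."""
    return [
        f"server.{server_id}={ip}:{peer_port}:{election_port}"
        for server_id, ip in sorted(ips.items())
    ]


def kafka_zookeeper_connect(ips, client_port=2181):
    """Build the value for the broker's zookeeper.connect property."""
    return ",".join(f"{ip}:{client_port}" for _, ip in sorted(ips.items()))


if __name__ == "__main__":
    print("\n".join(zoo_cfg_server_lines(ZK_FIXED_IPS)))
    print("zookeeper.connect=" + kafka_zookeeper_connect(ZK_FIXED_IPS))
```

Since both configs contain only IPs that never change, a replacement ZK instance that takes over the same address and volume needs no config change on the brokers.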
On Wed, Aug 3, 2016 at 7:22 AM, Zuber <objectsp...@gmail.com> wrote:

> Hello –
>
> We are planning to use Kafka as the event store in a system that is being
> built using an event sourcing design approach.
> Here is how we deployed the cluster in AWS to verify HA in the cloud (in
> our POC we had only 1 topic with 1 partition and a replication factor of 3):
> 1) 3 ZK servers running in different AZs (managed by an Auto Scaling Group)
> 2) 3 Kafka broker EC2 instances running in different AZs (managed by an
>    Auto Scaling Group)
> 3) Kafka logs are stored in EBS volumes
> 4) A records are defined for all ZK servers & Kafka brokers in Route53.
>    Each EC2 instance registers its IP for the corresponding A record (in
>    Route53) on startup.
>
> But due to a bug in the ZKClient used by the Kafka broker, which caches the
> ZK IP forever, I don’t see any option other than bouncing all brokers.
>
> One of the Netflix presentations (linked below) mentions the issue, as do
> a couple of ZK JIRA defects, but I haven’t found any concrete solution yet.
> I would really appreciate any help in this regard.
>
> http://image.slidesharecdn.com/netflix-kafka-150325105558-conversion-gate01/95/netflix-data-pipeline-with-kafka-36-638.jpg?cb=1427281139
> https://issues.apache.org/jira/browse/ZOOKEEPER-338
> https://issues.apache.org/jira/browse/ZOOKEEPER-1506
> http://grokbase.com/t/kafka/users/131x67h1bt/zookeeper-caching-dns-entries
>
> Thanks,
> Zuber