I'm also interested in hearing more about deploying Kafka in AWS.
I was also considering options like your 1a and 2. I ran some calculations and
one interesting thing I ran across was bandwidth costs between AZs.
In 1a, if you can have your producers and consumers in the same AZ as the
"master"
OK, yeah, speaking from experience I would be comfortable with using the
ephemeral storage if it's replicated across AZs. More and more EC2 instances
have local SSDs, so you'll get great IO. Of course, you'd better monitor your
instances, and if an instance terminates, you're vulnerable if a second
I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address
my concern about a producer writing to the partition leader in a different AZ,
though.
To answer your question, I was thinking ephemerals with replication, yes.
With a reservation, it's pretty easy to get e.g. two i2.xlarge fo
If only Kafka had rack awareness, you could run 1 cluster and set up the
replicas in different AZs.
https://issues.apache.org/jira/browse/KAFKA-1215
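
To make the idea concrete, here is a rough sketch of what AZ-aware replica
placement could look like. This is not Kafka's assignment algorithm and not
necessarily what KAFKA-1215 proposes; the function name, the broker IDs, and
the AZ names are all made up for illustration. It only shows the property you
would want: every replica of a partition lands in a different AZ.

```python
def assign_replicas(brokers_by_az, num_partitions, replication_factor):
    """Hypothetical helper: spread each partition's replicas across AZs.

    brokers_by_az maps an AZ name to a list of broker IDs in that AZ.
    Returns a dict mapping partition number -> list of broker IDs,
    where the first broker in each list is the preferred leader.
    """
    azs = sorted(brokers_by_az)
    if replication_factor > len(azs):
        raise ValueError("need at least as many AZs as replicas")
    if any(not brokers_by_az[az] for az in azs):
        raise ValueError("every AZ must contain at least one broker")

    # One round-robin cursor per AZ so brokers inside an AZ share the load.
    cursors = {az: 0 for az in azs}
    assignment = {}
    for partition in range(num_partitions):
        replicas = []
        for i in range(replication_factor):
            # Rotate the starting AZ per partition so leaders are spread out;
            # consecutive offsets guarantee distinct AZs for one partition.
            az = azs[(partition + i) % len(azs)]
            brokers = brokers_by_az[az]
            replicas.append(brokers[cursors[az] % len(brokers)])
            cursors[az] += 1
        assignment[partition] = replicas
    return assignment


if __name__ == "__main__":
    # Illustrative layout: two brokers in each of three AZs (assumed values).
    layout = {
        "us-east-1a": [1, 2],
        "us-east-1b": [3, 4],
        "us-east-1c": [5, 6],
    }
    for partition, replicas in assign_replicas(layout, 6, 2).items():
        print(partition, replicas)
```

With a layout like this, losing one AZ still leaves a replica of every
partition in another AZ. It does nothing about producers writing across AZs
to reach the leader, which is the bandwidth concern raised earlier in the
thread.
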
As for your question about ephemeral versus EBS, I presume you are proposing to
use ephemeral *with* replicas, right?
Philip
---
We're planning a deployment to AWS EC2, and I was hoping to get some advice on
best practices. I've seen the Loggly presentation [1], which has some good
recommendations on instance types and EBS setup. Aside from that, there
seem to be several options in terms of multi-Availability Zone (AZ)
deploymen