We have run Kafka on d2 instances. They're currently unstable: yesterday Amazon confirmed a host issue with d2 instances that is triggered by a Kafka workload. Otherwise, the d2 instance type seems ideal, as it gets an enormous amount of disk throughput and you'll likely be network bottlenecked.
Wes
Steven Wu <mailto:stevenz...@gmail.com>
June 2, 2015 at 1:07 PM

EBS (network-attached storage) has gotten a lot better over the last few years, but we don't quite trust it for a Kafka workload. At Netflix, we were going with the new d2 instance type (HDD). Our perf/load testing shows it satisfies our workload. SSD has a better latency curve but is pretty comparable in terms of throughput, and we can use the extra space from HDD for a longer retention period.

On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <h...@pinterest.com.invalid> wrote:

Henry Cai <mailto:h...@pinterest.com.INVALID>
June 2, 2015 at 12:37 PM

We have been hosting Kafka brokers in Amazon EC2 using EBS disks, but periodically we are hit by long I/O wait times on EBS in some Availability Zones. We are thinking of changing the instance type to one with local HDD or local SSD. HDD is cheaper and bigger and seems a good fit for the Kafka use case, which is mostly sequential read/write. However, some early experiments show the HDD cannot keep up with the message producing rate, since the many topics/partitions on the broker make the disk I/O more random.

What are people's experiences with choosing disk types on Amazon?