Daniel Nelson <mailto:daniel.nel...@vungle.com>
June 2, 2015 at 4:39 PM
On Jun 2, 2015, at 1:22 PM, Steven Wu <stevenz...@gmail.com> wrote:
Can you elaborate on what kind of instability you have encountered?
We have seen the nodes become completely non-responsive. Usually they get 
rebooted automatically after 10-20 minutes, but occasionally they get stuck for 
days in a state where they cannot be rebooted via the Amazon APIs.

Same here. It was worse right after the d2 launch: 6 of our 9 servers died within 10 hours of spinning them up. Amazon rolled out a fix, but we're still seeing similar issues, though not nearly as bad. The first fix was for something network related; apparently sending lots of data through the instances caused a kernel panic on the host. We have no information yet about the current issue.

Wes

Steven Wu <mailto:stevenz...@gmail.com>
June 2, 2015 at 4:22 PM
Wes/Daniel,

Can you elaborate on what kind of instability you have encountered?

We are on Ubuntu 14.04.2 and haven't encountered any issues so far. In the announcement, they did mention using Ubuntu 14.04 for better disk throughput. Not sure whether 14.04 also addresses the instability you encountered.

Thanks,
Steven

In order to ensure the best disk throughput performance from your D2 instances on Linux, we recommend that you use the most recent version of the Amazon Linux AMI, or another Linux AMI with a kernel version of 3.8 or later. The D2 instances provide the best disk performance when you use a Linux kernel that supports Persistent Grants – an extension to the Xen block ring protocol that significantly improves disk throughput and scalability. The following Linux AMIs support this feature:

  * Amazon Linux AMI 2015.03 (HVM)
  * Ubuntu Server 14.04 LTS (HVM)
  * Red Hat Enterprise Linux 7.1 (HVM)
  * SUSE Linux Enterprise Server 12 (HVM)
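
As a rough illustration of the kernel requirement quoted above, here is a minimal Python sketch that checks whether a kernel release string is 3.8 or later. The function names and the MIN_KERNEL constant are made up for this example. It compares version numbers only, so passing the check does not prove that Persistent Grants are actually in use on a given instance.

```python
import platform
import re

# Threshold taken from the announcement text above (kernel 3.8 or later).
MIN_KERNEL = (3, 8)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '3.13.0-53-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release's (major, minor) is at least `minimum`."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the running kernel release (like `uname -r`).
    release = platform.release()
    status = "meets" if kernel_meets_minimum(release) else "is below"
    print("kernel %s %s the %d.%d minimum" % ((release, status) + MIN_KERNEL))
```

Running this on each broker before a rollout would flag hosts still on an older kernel, for example a 3.2-series release string, which fails the check.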




Daniel Nelson <mailto:daniel.nel...@vungle.com>
June 2, 2015 at 2:42 PM

Do you have any workarounds for the d2 issues? We’ve been using them for our Kafkas too, and ran into the instability. We’re on Ubuntu 12.04 and plan to try on 14.04 with the latest HWE to see if that helps any.

Thanks!

Wes Chow <mailto:w...@chartbeat.com>
June 2, 2015 at 1:39 PM

We have run d2 instances with Kafka. They're currently unstable -- yesterday Amazon confirmed a host issue with d2 instances that gets tickled by a Kafka workload. Otherwise, the d2 instance type seems ideal: it gets an enormous amount of disk throughput, so you'll likely be network bottlenecked.

Wes


Steven Wu <mailto:stevenz...@gmail.com>
June 2, 2015 at 1:07 PM
EBS (network attached storage) has gotten a lot better over the last few
years, but we still don't quite trust it for Kafka workloads.

At Netflix, we're going with the new d2 instance type (HDD). Our
perf/load testing shows it satisfies our workload. SSD has a better latency
curve but is pretty comparable in terms of throughput. We can use the extra
space from HDD for a longer retention period.

On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <h...@pinterest.com.invalid>
