Daniel Nelson <mailto:daniel.nel...@vungle.com>
June 2, 2015 at 4:39 PM
On Jun 2, 2015, at 1:22 PM, Steven Wu <stevenz...@gmail.com> wrote:
Can you elaborate on what kind of instability you have encountered?
We have seen the nodes become completely non-responsive. Usually they get
rebooted automatically after 10-20 minutes, but occasionally they get stuck for
days in a state where they cannot be rebooted via the Amazon APIs.
Same here. It was worse right after the d2 launch. We had 6 out of 9 servers
die within 10 hours of spinning them up. Amazon rolled out a fix, but
we're still seeing similar issues, though not nearly as bad. The first
fix was for something network-related: apparently sending lots of
data through the instances caused a kernel panic on the host. We have no
information yet about the current issue.
Wes
Steven Wu <mailto:stevenz...@gmail.com>
June 2, 2015 at 4:22 PM
Wes/Daniel,
Can you elaborate on what kind of instability you have encountered?
We are on Ubuntu 14.04.2 and haven't encountered any issues so far. In
the announcement, they did mention using Ubuntu 14.04 for better disk
throughput. Not sure whether 14.04 also addresses the instability
issues you encountered.
Thanks,
Steven
In order to ensure the best disk throughput performance from your
D2 instances on Linux, we recommend that you use the most recent
version of the Amazon Linux AMI, or another Linux AMI with a kernel
version of 3.8 or later. The D2 instances provide the best disk
performance when you use a Linux kernel that supports Persistent
Grants – an extension to the Xen block ring protocol that
significantly improves disk throughput and scalability. The following
Linux AMIs support this feature:
* Amazon Linux AMI 2015.03 (HVM)
* Ubuntu Server 14.04 LTS (HVM)
* Red Hat Enterprise Linux 7.1 (HVM)
* SUSE Linux Enterprise Server 12 (HVM)
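The kernel requirement quoted above can be checked on a running node. The following is a minimal sketch, not anything from the announcement: the helper names `parse_kernel_release` and `meets_minimum` are hypothetical, and it only compares the release string against the stated 3.8 minimum. It does not verify that Persistent Grants are actually in use.

```python
import platform
import re

# Minimum kernel version named in the announcement quoted above.
MIN_KERNEL = (3, 8)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '3.13.0-53-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release is at or above the minimum (major, minor) version."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() returns the same string that `uname -r` prints.
    release = platform.release()
    status = "meets" if meets_minimum(release) else "is older than"
    print("kernel %s %s the 3.8 minimum" % (release, status))
```

Comparing (major, minor) tuples rather than raw strings avoids ranking 3.13 below 3.8, which a plain string comparison would do.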
Daniel Nelson <mailto:daniel.nel...@vungle.com>
June 2, 2015 at 2:42 PM
Do you have any workarounds for the d2 issues? We’ve been using them
for our Kafkas too, and ran into the instability. We’re on Ubuntu
12.04 and plan to try on 14.04 with the latest HWE to see if that
helps any.
Thanks!
Wes Chow <mailto:w...@chartbeat.com>
June 2, 2015 at 1:39 PM
We have run d2 instances with Kafka. They're currently unstable --
yesterday Amazon confirmed a host issue with d2 instances that gets
tickled by a Kafka workload. Otherwise, the d2 instance type seems
ideal, as it gets an enormous amount of disk throughput and you'll
likely be network bottlenecked.
Wes
Steven Wu <mailto:stevenz...@gmail.com>
June 2, 2015 at 1:07 PM
EBS (network-attached storage) has gotten a lot better over the last few
years, but we still don't quite trust it for Kafka workloads.
At Netflix, we were going with the new d2 instance type (HDD). Our
perf/load testing shows it satisfies our workload. SSD has a better latency
curve but is pretty comparable in terms of throughput. We can use the extra
space from HDD for a longer retention period.
On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <h...@pinterest.com.invalid>