Steven, do you have the AWS case # (or the Ubuntu bug/case #) from when you hit that kernel panic issue?
Our company will still be running on the 12.04 AMI for a while, so I will see whether the fix was also ported to Ubuntu 12.04.

On Tue, Jun 2, 2015 at 2:53 PM, Steven Wu <stevenz...@gmail.com> wrote:

> now I remember we had same kernel panic issue in the first week of D2
> rolling-out. then AWS fixed it and we haven't seen any issue since. try
> Ubuntu 14.04 and see if it resolves your remaining kernel/instability issue.
>
> On Tue, Jun 2, 2015 at 2:30 PM, Wes Chow <w...@chartbeat.com> wrote:
>
>> Daniel Nelson <daniel.nel...@vungle.com>
>> June 2, 2015 at 4:39 PM
>>
>> On Jun 2, 2015, at 1:22 PM, Steven Wu <stevenz...@gmail.com> wrote:
>>
>> can you elaborate what kind of instability you have encountered?
>>
>> We have seen the nodes become completely non-responsive. Usually they get
>> rebooted automatically after 10-20 minutes, but occasionally they get stuck
>> for days in a state where they cannot be rebooted via the Amazon APIs.
>>
>> Same here. It was worse right after d2 launch. We had 6 out of 9 servers
>> die within 10 hours after spinning them up. Amazon rolled out a fix, but
>> we're still seeing similar issues, though not nearly as bad. The first fix
>> was for something network related, and apparently sending lots of data
>> through the instances caused a kernel panic on the host. We have no
>> information yet about the current issue.
>>
>> Wes
>>
>> Steven Wu <stevenz...@gmail.com>
>> June 2, 2015 at 4:22 PM
>>
>> Wes/Daniel,
>>
>> can you elaborate what kind of instability you have encountered?
>>
>> we are on Ubuntu 14.04.2 and haven't encountered any issues so far. in
>> the announcement, they did mention using Ubuntu 14.04 for better disk
>> throughput. not sure whether 14.04 also addresses any instability issue you
>> encountered or not.
>>
>> Thanks,
>> Steven
>>
>> In order to ensure the best disk throughput performance from your D2
>> instances on Linux, we recommend that you use the most recent version of
>> the Amazon Linux AMI, or another Linux AMI with a kernel version of 3.8 or
>> later. The D2 instances provide the best disk performance when you use a
>> Linux kernel that supports Persistent Grants – an extension to the Xen
>> block ring protocol that significantly improves disk throughput and
>> scalability. The following Linux AMIs support this feature:
>>
>> - Amazon Linux AMI 2015.03 (HVM)
>> - Ubuntu Server 14.04 LTS (HVM)
>> - Red Hat Enterprise Linux 7.1 (HVM)
>> - SUSE Linux Enterprise Server 12 (HVM)
>>
>> Daniel Nelson <daniel.nel...@vungle.com>
>> June 2, 2015 at 2:42 PM
>>
>> Do you have any workarounds for the d2 issues? We’ve been using them for
>> our Kafkas too, and ran into the instability. We’re on Ubuntu 12.04 and
>> plan to try on 14.04 with the latest HWE to see if that helps any.
>>
>> Thanks!
>>
>> Wes Chow <w...@chartbeat.com>
>> June 2, 2015 at 1:39 PM
>>
>> We have run d2 instances with Kafka. They're currently unstable -- Amazon
>> confirmed a host issue with d2 instances that gets tickled by a Kafka
>> workload yesterday. Otherwise, it seems the d2 instance type is ideal as it
>> gets an enormous amount of disk throughput and you'll likely be network
>> bottlenecked.
>>
>> Wes
>>
>> Steven Wu <stevenz...@gmail.com>
>> June 2, 2015 at 1:07 PM
>>
>> EBS (network attached storage) has got a lot better over the last a few
>> years. we don't quite trust it for kafka workload.
>>
>> At Netflix, we were going with the new d2 instance type (HDD). our
>> perf/load testing shows it satisfy our workload. SSD is better in latency
>> curve but pretty comparable in terms of throughput. we can use the extra
>> space from HDD for longer retention period.
>>
>> On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <h...@pinterest.com.invalid>