Irrespective of performance and latency numbers, there are fundamental flaws 
in using EBS/NAS with Cassandra, particularly around bandwidth contention and 
what happens when the shared storage medium fails. Also, the obligatory 
reference to http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html.

Regarding ENIs:

AWS are pretty explicit about their impact on bandwidth:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html
"Attaching another network interface to an instance is not a method to increase 
or double the network bandwidth to or from the dual-homed instance."

So Nate, you are right in that it is the logical separation of traffic that 
helps, rather than any extra bandwidth. 
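For what it's worth, adding an ENI for that kind of traffic segmentation is straightforward. A minimal sketch with the AWS CLI (the subnet, security-group, interface, and instance IDs below are placeholders, not from this thread):

```shell
# Create a second interface in the same subnet (IDs are placeholders)
aws ec2 create-network-interface \
    --subnet-id subnet-xxxxxxxx \
    --groups sg-xxxxxxxx \
    --description "cassandra gossip/client traffic"

# Attach it to the node as eth1 (device index 1)
aws ec2 attach-network-interface \
    --network-interface-id eni-xxxxxxxx \
    --instance-id i-xxxxxxxx \
    --device-index 1
```

Again, per the AWS docs quoted above, this buys you separation and cleaner routing, not more bandwidth.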
 

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 20 Jun 2014, at 8:17 am, Nate McCall <n...@thelastpickle.com> wrote:

> Sorry - I should have been clear that I was speaking in terms of route 
> optimization, not bandwidth. I have no idea as to the implementation (probably 
> instance specific), and I doubt it actually doubles bandwidth. 
> 
> Specifically: having an ENI dedicated to API traffic did smooth out some 
> recent load tests we did for a client. It could be that the overall throughput 
> increases were more a function of cleaner traffic segmentation/smoother 
> routing. We weren't being terribly scientific - it was more an artifact of 
> testing network segmentation. 
> 
> I'm just going to say that "using an ENI will make things better", since 
> traffic segmentation is always good practice anyway. :) YMMV. 
> 
> 
> 
> On Thu, Jun 19, 2014 at 3:39 PM, Russell Bradberry <rbradbe...@gmail.com> 
> wrote:
> does an elastic network interface really use a different physical network 
> interface? or is it just to give the ability for multiple ip addresses?
> 
> 
> 
> On June 19, 2014 at 3:56:34 PM, Nate McCall (n...@thelastpickle.com) wrote:
> 
>> If someone really wanted to try this, I would recommend adding an Elastic 
>> Network Interface or two for gossip and client/API traffic. This lets EBS 
>> and management traffic keep the pre-configured network. 
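Concretely, the split Nate describes might look like this in cassandra.yaml. The addresses and interface names here are hypothetical: eth1 is the added ENI, and eth0 (the primary interface) keeps carrying EBS and management traffic by default:

```yaml
# cassandra.yaml -- hypothetical addresses; eth1 is the added ENI
listen_address: 10.0.1.20    # gossip/internode traffic on the ENI (eth1)
rpc_address: 10.0.1.20       # client/API traffic on the ENI (eth1)
# EBS and management traffic continue to use the primary interface
# (eth0, e.g. 10.0.0.20), keeping storage I/O off the client path.
```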
>> 
>> 
>> On Thu, Jun 19, 2014 at 6:54 AM, Benedict Elliott Smith 
>> <belliottsm...@datastax.com> wrote:
>> I would say this is worth benchmarking before jumping to conclusions. The 
>> network being a bottleneck (or a source of latency) for EBS is, to my 
>> knowledge, supposition, and instances can be started with direct connections 
>> to EBS if this is a concern. The blog post below shows that even without 
>> SSDs, the EBS-optimised provisioned-IOPS instances show pretty consistent 
>> latency numbers, although those latencies are higher than you would typically 
>> expect from locally attached storage.
>> 
>> http://blog.parse.com/2012/09/17/parse-databases-upgraded-to-amazon-provisioned-iops/
>> 
>> Note, I'm not endorsing the use of EBS. Cassandra is designed to scale up 
>> with number of nodes, not with depth of nodes (as Ben mentions, saturating a 
>> single node's data capacity is pretty easy these days. CPUs rapidly become 
>> the bottleneck as you try to go deep). However, the argument that EBS cannot 
>> provide consistent performance seems overly pessimistic, and should probably 
>> be tested empirically for your use case.
>> 
>> 
>> On Thu, Jun 19, 2014 at 9:50 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>> Ok, that seems fair enough.
>> 
>> Thanks guys. It would be great to be able to add disks as the amount of data 
>> grows and add nodes as throughput increases... :)
>> 
>> 
>> 2014-06-19 5:27 GMT+02:00 Ben Bromhead <b...@instaclustr.com>:
>> 
>> http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningEC2_c.html
>> 
>> From the link:
>> 
>> EBS volumes are not recommended for Cassandra data volumes for the following 
>> reasons:
>> 
>> • EBS volumes contend directly for network throughput with standard packets. 
>> This means that EBS throughput is likely to fail if you saturate a network 
>> link.
>> • EBS volumes have unreliable performance. I/O performance can be 
>> exceptionally slow, causing the system to back load reads and writes until 
>> the entire cluster becomes unresponsive.
>> • Adding capacity by increasing the number of EBS volumes per host does not 
>> scale. You can easily surpass the ability of the system to keep effective 
>> buffer caches and concurrently serve requests for all of the data it is 
>> responsible for managing.
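The first bullet (network contention) is easy to quantify with back-of-the-envelope arithmetic. A sketch under assumed numbers — a 1 Gbit/s instance link shared by EBS and Cassandra traffic; the actual link speed and traffic figures vary by instance type and workload:

```python
LINK_GBPS = 1.0                   # assumed instance network link
LINK_MB_S = LINK_GBPS * 1000 / 8  # -> 125 MB/s total budget

# Assumed steady-state Cassandra traffic sharing the same link
streaming_mb_s = 60.0             # repair/bootstrap streaming
client_mb_s = 40.0                # reads/writes to clients

# Whatever is left is the ceiling for EBS I/O, since EBS shares the link
ebs_budget_mb_s = LINK_MB_S - streaming_mb_s - client_mb_s
print(ebs_budget_mb_s)            # -> 25.0 MB/s left for disk I/O
```

Under these assumptions, a burst of streaming or client traffic eats directly into the disk I/O budget — which is exactly the failure mode the DataStax bullet describes.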
>> 
>> Still applies, especially the network contention and latency issues. 
>> 
>> Ben Bromhead
>> Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
>> 
>> On 18 Jun 2014, at 7:18 pm, Daniel Chia <danc...@coursera.org> wrote:
>> 
>>> While they guarantee IOPS, they don't really make any guarantees about 
>>> latency. Since EBS goes over the network, there's so many things in the 
>>> path of getting at your data, I would be concerned with random latency 
>>> spikes, unless proven otherwise.
>>> 
>>> Thanks,
>>> Daniel
>>> 
>>> 
>>> On Wed, Jun 18, 2014 at 1:58 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>> In this document, it is said:
>>> 
>>> Provisioned IOPS (SSD) - Volumes of this type are ideal for the most 
>>> demanding I/O intensive, transactional workloads and large relational or 
>>> NoSQL databases. This volume type provides the most consistent performance 
>>> and allows you to provision the exact level of performance you need with 
>>> the most predictable and consistent performance. With this type of volume 
>>> you provision exactly what you need, and pay for what you provision. Once 
>>> again, you can achieve up to 48,000 IOPS by connecting multiple volumes 
>>> together using RAID.
>>> 
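For reference, the RAID mentioned in that Amazon text is typically software RAID 0 across several provisioned-IOPS volumes. A hedged sketch with mdadm — the device names, volume count, and mount point are placeholders, not something from this thread:

```shell
# Stripe four attached PIOPS volumes into one device (names are placeholders)
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
    /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
mkfs.ext4 /dev/md0
mount /dev/md0 /var/lib/cassandra
```

Note this multiplies IOPS, but every striped volume still shares the same instance network link, so it does nothing for the contention issue above.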
>>> 
>>> 2014-06-18 10:57 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>> 
>>> Hi,
>>> 
>>> I just saw this : 
>>> http://aws.amazon.com/fr/blogs/aws/new-ssd-backed-elastic-block-storage/
>>> 
>>> Since the problem with EBS was the network, there is no chance that this 
>>> hardware architecture might be useful alongside Cassandra, right ?
>>> 
>>> Alain
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> -----------------
>> Nate McCall
>> Austin, TX
>> @zznate
>> 
>> Co-Founder & Sr. Technical Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
> 
> 
> 
