Hi Jeff,

I'm going to go back and review your presentation. I missed it at Cassandra Summit and didn't make it to re:Invent last year. The opinion I voiced was from my own direct experience. I didn't mean to imply that there weren't other good options available.
Thanks,
Steve

On Mon, Feb 1, 2016 at 2:12 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

> A lot of people use the old gen instances (m1 in particular) because they came with a ton of effectively free ephemeral storage (up to 1.6TB). Whether or not they’re viable is a decision for each user to make. They’re very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto standard.

> The canonical “best practice” in 2015 was i2. We believe we’ve made a compelling argument to use m4 or c4 instead of i2. There is a company we know of that is currently testing d2 at scale, though I’m not sure they have much in the way of concrete results at this time.

> - Jeff

> From: Jack Krupansky
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, February 1, 2016 at 1:55 PM
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*

> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2 Dense Storage".

> The remaining question is whether any of the "Previous Generation Instances" should be publicly recommended going forward.

> And whether non-SSD instances should be recommended going forward as well. Sure, technically, someone could use the legacy instances, but the question is what we should be recommending as best practice going forward.

> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.

> -- Jack Krupansky

> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sroben...@highwire.org> wrote:

>> Hi Jack,

>> At the bottom of the instance-types page, there is a link to the previous generations, which includes the older series (m1, m2, etc.), many of which have HDD options.

>> There are also the d2 (Dense Storage) instances in the current generation that include various combos of local HDDs.

>> The i2 series has good-sized SSDs available, and has the enhanced networking option, which is also useful for Cassandra. Enhanced networking is available with other instance types as well, as you'll see in the feature list under each type.

>> Steve

>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

>>> Thanks. Reading a bit of the AWS docs, and back to my SSD vs. magnetic question: it seems like magnetic (HDD) is no longer a recommended storage option for databases on AWS. In particular, only the C2 Dense Storage instances have local magnetic storage - all the other instance types are SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data Access."

>>> For the record, that AWS doc has Cassandra listed as a use case for i2 instance types.

>>> Also, the AWS doc lists EBS io1 (Provisioned IOPS) for the NoSQL database use case and gp2 only for the "small to medium databases" use case.

>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Or is the doc simply for any newly started instances?

>>> See:
>>> https://aws.amazon.com/ec2/instance-types/
>>> http://aws.amazon.com/ebs/details/

>>> -- Jack Krupansky

>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

>>>> > My apologies if my questions are actually answered on the video or slides; I just did a quick scan of the slide text.

>>>> Virtually all of them are covered.
>>>> > I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?

>>>> Not published, and probably not a straightforward answer (they probably have cross-AZ redundancy, if it matches some of their other published behaviors). The promise they give you is ‘iops’, with a certain block size. Some instance types are optimized with dedicated, EBS-only network interfaces. Like most things in Cassandra / cloud, the only way to know for sure is to test it yourself and see if observed latency is acceptable (or trust our testing, if you assume we’re sufficiently smart and honest).

>>>> > Did your test use Amazon EBS–Optimized Instances?

>>>> We tested dozens of instance type/size combinations (literally). The best performance was clearly with EBS-optimized instances that also have enhanced networking (c4, m4, etc.) (slide 43).

>>>> > SSD or magnetic or does it make any difference?

>>>> SSD, GP2 (slide 64).

>>>> > What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

>>>> Not published, but experiments show that we can hit 10k IOPS all day every day with only trivial noisy-neighbor problems, not enough to impact a real cluster (slide 58).

>>>> > Is RAID much of a factor or help at all using EBS?

>>>> You can use RAID to get higher IOPS than you’d normally get by default (the GP2 IOPS cap is 10k, which you get with a 3.333T volume – if you need more than 10k, you can stripe volumes together up to the EBS network link max) (hinted at in slide 64).

>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

>>>> There is HA; I’m not sure that AWS publishes specifics. Occasionally specific volumes will have issues (the hypervisor’s dedicated ethernet link to the EBS network fails, for example). Occasionally instances will have issues. The volume-specific issues seem to be less common than the instance-store “instance retired” or “instance is running on degraded hardware” events. Stop/start and you’ve recovered (possible with EBS, not possible with instance store). The assurances are in AWS’ SLA – if the SLA is insufficient (and it probably is insufficient), use more than one AZ and/or AWS region or cloud vendor.

>>>> > For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

>>>> It used to be true that the EBS control plane for a given region spanned AZs. That’s no longer true. AWS asserts that failure modes for each AZ are isolated (data may replicate between AZs, but a full outage in us-east-1a shouldn’t affect running EBS volumes in us-east-1b or us-east-1c) (slide 65).
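As an aside, the gp2 sizing rule of thumb in the RAID answer above (3 IOPS per GB, capped at 10k IOPS per volume, stripe if you need more) can be made concrete with a little arithmetic. A minimal sketch, assuming the published 3 IOPS/GB gp2 ratio of the time and the 10k per-volume cap quoted above; the function names are illustrative, not any AWS API:

    # gp2 sizing sketch (figures as quoted in the thread: 3 IOPS per GB,
    # capped at 10,000 IOPS per volume; names here are illustrative only).
    import math

    GP2_IOPS_PER_GB = 3
    GP2_MAX_IOPS_PER_VOLUME = 10000

    def gp2_volume_gb_for_iops(target_iops):
        """Smallest gp2 volume (in GB) whose baseline IOPS meets target_iops."""
        if target_iops > GP2_MAX_IOPS_PER_VOLUME:
            raise ValueError("above the per-volume cap; stripe volumes instead")
        return math.ceil(target_iops / GP2_IOPS_PER_GB)

    def gp2_stripe_plan(target_iops):
        """Number of max-IOPS gp2 volumes to stripe (RAID 0) for target_iops.

        Ignores the per-instance EBS bandwidth limit, which is the real ceiling
        on EBS-optimized instances and varies by instance type.
        """
        volumes = math.ceil(target_iops / GP2_MAX_IOPS_PER_VOLUME)
        return volumes, volumes * gp2_volume_gb_for_iops(GP2_MAX_IOPS_PER_VOLUME)

    print(gp2_volume_gb_for_iops(10000))  # 3334 GB, i.e. the ~3.33T volume above
    print(gp2_stripe_plan(20000))         # (2, 6668) -> two striped ~3.33T volumes

For 10k IOPS this works out to the ~3.33T volume mentioned above; 20k IOPS would mean striping two such volumes in RAID 0, assuming the instance's EBS link can carry it.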
>>>> > In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data? As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.

>>>> Most of the answers in this block are “probably not 100%; you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures”. AWS targets something like a 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they’re exceeding those goals (at least based on the petabytes of data we have on gp2 volumes).

>>>> From: Jack Krupansky
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: Re: EC2 storage options for C*

>>>> I'm not a fan of video - this appears to be the slideshare corresponding to the video:
>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second

>>>> My apologies if my questions are actually answered on the video or slides; I just did a quick scan of the slide text.

>>>> I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?

>>>> Did your test use Amazon EBS–Optimized Instances?

>>>> SSD or magnetic or does it make any difference?

>>>> What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?

>>>> Is RAID much of a factor or help at all using EBS?

>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?

>>>> For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?

>>>> In terms of syncing data for the commit log, if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data? As well, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.

>>>> -- Jack Krupansky
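To put the reliability figures above in perspective, here is a rough back-of-the-envelope calculation. A sketch only: it assumes one gp2 data volume per node, treats volume failures as independent, and uses the ~0.1% annual failure rate and the 60-node cluster size quoted elsewhere in this thread:

    # Back-of-the-envelope EBS volume failure math (illustrative only).
    # Assumes one gp2 data volume per node, independent failures, and the
    # ~0.1% annual failure rate per volume quoted above.
    afr = 0.001   # ~0.1% annual failure rate per volume
    nodes = 60    # cluster size used for the 1M writes/sec test in this thread

    expected_failures_per_year = afr * nodes
    p_at_least_one_failure = 1 - (1 - afr) ** nodes

    print("expected volume failures/year: %.3f" % expected_failures_per_year)          # 0.060
    print("P(>=1 volume failure in a year): %.1f%%" % (100 * p_at_least_one_failure))  # ~5.8%

In other words, a cluster of that size should still expect the occasional volume loss over its lifetime, which is why the answer above is to rely on replication across hosts/AZs rather than on any single volume.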
>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.pl...@gmail.com> wrote:

>>>>> Jeff,

>>>>> If EBS goes down, then EBS GP2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome.

>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

>>>>>> You're free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS.

>>>>>> -- Jeff Jirsa

>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.pl...@gmail.com> wrote:

>>>>>> Thank you all for the suggestions. I'm torn between GP2 and ephemeral. After testing, GP2 is a viable contender for our workload. The only worry I have is EBS outages, which have happened.

>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

>>>>>>> Also in that video - it's long but worth watching.

>>>>>>> We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory.

>>>>>>> -- Jeff Jirsa

>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

>>>>>>> How about reads? Any differences between read-intensive and write-intensive workloads?

>>>>>>> -- Jack Krupansky

>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

>>>>>>>> Hi John,

>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k IOPS. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary.

>>>>>>>> From: John Wong
>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>> Subject: Re: EC2 storage options for C*

>>>>>>>> For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions.
>>>>>>>> However, for a regular small testing/QA cluster, or something you know you want to reload often, EBS is definitely good enough, and we haven't had issues 99% of the time. The 1% is kind of an anomaly where we had flushes blocked.

>>>>>>>> But Jeff, kudos that you are able to use EBS. I didn't go through the video; do you actually use PIOPS or just standard GP2 in your production cluster?

>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:

>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, it's hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headaches.

>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. When you don’t mind replacing a node because of an instance failure, go with i2 + ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent:
>>>>>>>>>> https://www.youtube.com/watch?v=1R-mgOcOSd4
>>>>>>>>>> It’s very much a viable option, despite any old documents online that say otherwise.

>>>>>>>>>> From: Eric Plowe
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: EC2 storage options for C*

>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with two 250 GB Samsung 850 EVOs in RAID 0, and we are happy with the performance we are seeing thus far.

>>>>>>>>>> Thanks!

>>>>>>>>>> Eric

>> --
>> Steve Robenalt
>> Software Architect
>> sroben...@highwire.org <bza...@highwire.org>
>> (office/cell): 916-505-1785

>> HighWire Press, Inc.
>> 425 Broadway St, Redwood City, CA 94063
>> www.highwire.org

>> Technology for Scholarly Communication

--
Steve Robenalt
Software Architect
sroben...@highwire.org <bza...@highwire.org>
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication
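As a footnote to the RAID 0 question that started the thread: striping two ephemeral SSDs is typically done with mdadm RAID 0 and a single filesystem mounted at the Cassandra data directory. A rough provisioning sketch follows, with its assumptions called out: Linux with mdadm installed, instance-store devices appearing as /dev/xvdb and /dev/xvdc (hypothetical; device names vary by instance type), and ext4 with noatime as the filesystem choice. It is illustrative, not a hardened bootstrap script:

    # Illustrative RAID 0 setup for two ephemeral SSDs (not a hardened script).
    # Assumes Linux with mdadm installed and that the instance-store devices
    # appear as /dev/xvdb and /dev/xvdc (hypothetical; names vary by instance type).
    import subprocess

    DEVICES = ["/dev/xvdb", "/dev/xvdc"]   # assumed ephemeral device names
    ARRAY = "/dev/md0"
    MOUNT_POINT = "/var/lib/cassandra"     # default Cassandra data directory

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # Stripe the devices into one RAID 0 array (--run skips the confirmation prompt).
    run(["mdadm", "--create", ARRAY, "--run", "--level=0",
         "--raid-devices=%d" % len(DEVICES)] + DEVICES)

    # One filesystem across the stripe, mounted where Cassandra expects its data.
    run(["mkfs.ext4", ARRAY])
    run(["mkdir", "-p", MOUNT_POINT])
    run(["mount", "-o", "noatime", ARRAY, MOUNT_POINT])

The trade-off is the usual RAID 0 one: the stripe roughly doubles sequential throughput and IOPS headroom, but losing either device loses the whole array, which for instance-store data is already the operating assumption.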