Just curious here ... when did EBS become OK for C*? Didn't they always push towards using ephemeral disks?
On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead <b...@instaclustr.com> wrote:

> For what it's worth, we've tried d2 instances and they encourage terrible things like super-dense nodes (which increases your replacement time). In terms of usable storage, I would go with gp2 EBS on an m4-based instance.
>
> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
>> Ah, yes, the good old days of m1.large.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>
>>> A lot of people use the old-gen instances (m1 in particular) because they came with a ton of effectively free ephemeral storage (up to 1.6TB). Whether or not they're viable is a decision for each user to make. They're very, very commonly used for C*, though. At a time when EBS was not sufficiently robust or reliable, a cluster of m1 instances was the de facto standard.
>>>
>>> The canonical "best practice" in 2015 was i2. We believe we've made a compelling argument to use m4 or c4 instead of i2. There is a company we know of currently testing d2 at scale, though I'm not sure they have much in the way of concrete results at this time.
>>>
>>> - Jeff
>>>
>>> From: Jack Krupansky
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Monday, February 1, 2016 at 1:55 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: Re: EC2 storage options for C*
>>>
>>> Thanks. My typo - I referenced "C2 Dense Storage", which is really "D2 Dense Storage".
>>>
>>> The remaining question is whether any of the "Previous Generation Instances" should be publicly recommended going forward.
>>>
>>> And whether non-SSD instances should be recommended going forward as well. Sure, technically someone could use the legacy instances, but the question is what we should be recommending as best practice going forward.
>>>
>>> Yeah, the i2 instances look like the sweet spot for any non-EBS clusters.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt <sroben...@highwire.org> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> At the bottom of the instance-types page, there is a link to the previous generations, which includes the older series (m1, m2, etc.), many of which have HDD options.
>>>>
>>>> There are also the d2 (Dense Storage) instances in the current generation that include various combos of local HDDs.
>>>>
>>>> The i2 series has good-sized SSDs available and has the enhanced networking option, which is also useful for Cassandra. Enhanced networking is available with other instance types as well, as you'll see in the feature list under each type.
>>>>
>>>> Steve
>>>>
>>>> On Mon, Feb 1, 2016 at 1:17 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>
>>>>> Thanks. Reading a little bit on AWS, and back to my SSD vs. magnetic question: it seems like magnetic (HDD) is no longer a recommended storage option for databases on AWS. In particular, only the C2 Dense Storage instances have local magnetic storage - all the other instance types are SSD or EBS-only - and EBS Magnetic is only recommended for "Infrequent Data Access."
>>>>>
>>>>> For the record, that AWS doc has Cassandra listed as a use case for i2 instance types.
>>>>>
>>>>> Also, the AWS doc lists EBS io1 for the NoSQL database use case and gp2 only for the "small to medium databases" use case.
>>>>> Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is the doc simply describing newly started instances?
>>>>>
>>>>> See:
>>>>> https://aws.amazon.com/ec2/instance-types/
>>>>> http://aws.amazon.com/ebs/details/
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:09 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>>>
>>>>>> > My apologies if my questions are actually answered on the video or slides, I just did a quick scan of the slide text.
>>>>>>
>>>>>> Virtually all of them are covered.
>>>>>>
>>>>>> > I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, the same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?
>>>>>>
>>>>>> Not published, and probably not a straightforward answer (they probably have cross-AZ redundancy, if it matches some of their other published behaviors). The promise they give you is IOPS, with a certain block size. Some instance types are optimized with dedicated, EBS-only network interfaces. Like most things in Cassandra / cloud, the only way to know for sure is to test it yourself and see if the observed latency is acceptable (or trust our testing, if you assume we're sufficiently smart and honest).
>>>>>>
>>>>>> > Did your test use Amazon EBS-Optimized Instances?
>>>>>>
>>>>>> We tested dozens of instance type/size combinations (literally). The best performance was clearly with EBS-optimized instances that also have enhanced networking (c4, m4, etc.) - slide 43.
>>>>>>
>>>>>> > SSD or magnetic or does it make any difference?
>>>>>>
>>>>>> SSD, GP2 (slide 64).
>>>>>>
>>>>>> > What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Not published, but experiments show that we can hit 10k IOPS all day, every day, with only trivial noisy-neighbor problems - not enough to impact a real cluster (slide 58).
>>>>>>
>>>>>> > Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> You can use RAID to get higher IOPS than you'd get by default (the GP2 IOPS cap is 10k, which you get with a 3.333T volume - if you need more than 10k, you can stripe volumes together, up to the EBS network link max); hinted at in slide 64. (See the sketch of the arithmetic below.)
>>>>>>
>>>>>> > How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster, RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> There is HA; I'm just not sure that AWS publishes specifics. Occasionally specific volumes will have issues (a hypervisor's dedicated ethernet link to the EBS network fails, for example). Occasionally instances will have issues. The volume-specific issues seem to be less common than the instance-store "instance retired" or "instance is running on degraded hardware" events. Stop/start and you've recovered (possible with EBS, not possible with instance store). The assurances are in AWS' SLA - if the SLA is insufficient (and it probably is insufficient), use more than one AZ and/or AWS region or cloud vendor.
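A minimal sketch of the gp2 striping arithmetic referenced above, assuming the 2016-era numbers quoted in the thread (a baseline of 3 IOPS per GB, capped at 10k IOPS per volume); the helper functions are illustrative, not part of any AWS SDK:

GP2_IOPS_PER_GB = 3          # 2016-era gp2 baseline rate
GP2_VOLUME_IOPS_CAP = 10000  # per-volume cap at the time (reached at ~3,334 GB)

def gp2_baseline_iops(volume_gb):
    """Baseline IOPS for a single gp2 volume of the given size."""
    return min(volume_gb * GP2_IOPS_PER_GB, GP2_VOLUME_IOPS_CAP)

def volumes_to_stripe(target_iops):
    """How many max-size gp2 volumes to stripe (RAID 0) for a target."""
    return -(-target_iops // GP2_VOLUME_IOPS_CAP)  # ceiling division

print(gp2_baseline_iops(3334))   # 10000 - the ~3.333T volume mentioned above
print(volumes_to_stripe(25000))  # 3 volumes, still subject to the EBS link max

As the answer above notes, the striped total is still bounded by the instance's EBS network link, so adding volumes only helps up to that ceiling.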
>>>>>> > For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> It used to be true that the EBS control plane for a given region spanned AZs. That's no longer true. AWS asserts that failure modes for each AZ are isolated (data may replicate between AZs, but a full outage in us-east-1a shouldn't affect running EBS volumes in us-east-1b or us-east-1c). Slide 65.
>>>>>>
>>>>>> > In terms of syncing data for the commit log: if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data? Also, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.
>>>>>>
>>>>>> Most of the answers in this block are "probably not 100%; you should be writing to more than one host/AZ/DC/vendor to protect your organization from failures." AWS targets something like a 0.1% annual failure rate per volume and 99.999% availability (slide 66). We believe they're exceeding those goals (at least based on the petabytes of data we have on gp2 volumes).
>>>>>>
>>>>>> From: Jack Krupansky
>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>> Date: Monday, February 1, 2016 at 5:51 AM
>>>>>> To: "user@cassandra.apache.org"
>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>
>>>>>> I'm not a fan of video - this appears to be the slideshare corresponding to the video:
>>>>>> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>>>>>>
>>>>>> My apologies if my questions are actually answered on the video or slides; I just did a quick scan of the slide text.
>>>>>>
>>>>>> I'm curious where the EBS physical devices actually reside - are they in the same rack, the same data center, the same availability zone? I mean, people try to minimize network latency between nodes, so how exactly is EBS able to avoid network latency?
>>>>>>
>>>>>> Did your test use Amazon EBS-Optimized Instances?
>>>>>>
>>>>>> SSD or magnetic, or does it make any difference?
>>>>>>
>>>>>> What info is available on EBS performance at peak times, when multiple AWS customers have spikes of demand?
>>>>>>
>>>>>> Is RAID much of a factor or help at all using EBS?
>>>>>>
>>>>>> How exactly is EBS provisioned in terms of its own HA - I mean, with a properly configured Cassandra cluster, RF provides HA, so what is the equivalent for EBS? If I have RF=3, what assurance is there that those three EBS volumes aren't all in the same physical rack?
>>>>>>
>>>>>> For multi-data center operation, what configuration options assure that the EBS volumes for each DC are truly physically separated?
>>>>>>
>>>>>> In terms of syncing data for the commit log: if the OS call to sync an EBS volume returns, is the commit log data absolutely 100% synced at the hardware level on the EBS end, such that a power failure of the systems on which the EBS volumes reside will still guarantee availability of the fsynced data? Also, is return from fsync an absolute guarantee of sstable durability when Cassandra is about to delete the commit log, including when the two are on different volumes? In practice, we would like some significant degree of pipelining of data, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed.
>>>>>>
>>>>>> -- Jack Krupansky
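For reference, the OS-level pattern this commit-log question is about is write + fsync, including an fsync of the containing directory for newly created segments. A minimal Python sketch (the path is hypothetical; Cassandra does this internally, governed by the commitlog_sync settings in cassandra.yaml):

import os

def durable_append(path, payload):
    """Append bytes and ask the OS to flush them to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # returns only after the kernel/device reports the data stable
    finally:
        os.close(fd)
    # fsync the directory too, so the entry for a newly created file is durable
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)

durable_append("/var/lib/cassandra/commitlog/segment-0001.log", b"mutation bytes")

Whether an fsync that returns on an EBS-backed volume means the data is truly on stable storage is exactly the guarantee Jeff's answer above hedges on; the sketch only shows what the application side can do.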
>>>>>> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.pl...@gmail.com> wrote:
>>>>>>
>>>>>>> Jeff,
>>>>>>>
>>>>>>> If EBS goes down, then EBS gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome.
>>>>>>>
>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>>>>>
>>>>>>>> You're free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS.
>>>>>>>>
>>>>>>>> -- Jeff Jirsa
>>>>>>>>
>>>>>>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.pl...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Thank you all for the suggestions. I'm torn between GP2 and ephemeral. After testing, GP2 is a viable contender for our workload. The only worry I have is EBS outages, which have happened.
>>>>>>>>
>>>>>>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>>>>>>
>>>>>>>>> Also in that video - it's long, but worth watching.
>>>>>>>>>
>>>>>>>>> We tested up to 1M reads/second as well, blowing out the page cache to ensure we weren't "just" reading from memory.
>>>>>>>>>
>>>>>>>>> -- Jeff Jirsa
>>>>>>>>>
>>>>>>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> How about reads? Any differences between read-intensive and write-intensive workloads?
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> We run using 4T GP2 volumes, which guarantee 10k IOPS. Even at 1M writes per second on 60 nodes, we didn't come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary.
>>>>>>>>>>
>>>>>>>>>> From: John Wong
>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>> Subject: Re: EC2 storage options for C*
>>>>>>>>>>
>>>>>>>>>> For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions. However, for a regular small testing/QA cluster, or something you know you'll want to reload often, EBS is definitely good enough, and we haven't had issues 99% of the time. The 1% is a kind of anomaly where we saw flushes blocked.
>>>>>>>>>>
>>>>>>>>>> But Jeff, kudos that you are able to use EBS. I didn't go through the video - do you actually use PIOPS or just standard GP2 in your production cluster?
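An aside on Jeff's note above about "blowing out the page cache" for the read tests: on Linux, the usual mechanism is writing to /proc/sys/vm/drop_caches after a sync. A minimal sketch (requires root; the exact procedure used in the talk isn't specified, so this is just the standard approach):

import os

def drop_page_cache():
    """Drop the Linux page cache so subsequent reads hit the disks."""
    os.sync()  # flush dirty pages first so they can be evicted
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 1 = page cache, 2 = dentries/inodes, 3 = both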
>>>>>>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, it's hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.
>>>>>>>>>>>
>>>>>>>>>>> Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD, and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. When you don't care about replacing a node because of an instance failure, go with i2 + ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.
>>>>>>>>>>>>
>>>>>>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It's very much a viable option, despite any old documents online that say otherwise.
>>>>>>>>>>>>
>>>>>>>>>>>> From: Eric Plowe
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org"
>>>>>>>>>>>> Subject: EC2 storage options for C*
>>>>>>>>>>>>
>>>>>>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with two 250GB Samsung 850 EVOs in RAID 0, and we are happy with the performance we are seeing thus far.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Eric
>>>>
>>>> --
>>>> Steve Robenalt
>>>> Software Architect
>>>> sroben...@highwire.org
>>>> (office/cell): 916-505-1785
>>>>
>>>> HighWire Press, Inc.
>>>> 425 Broadway St, Redwood City, CA 94063
>>>> www.highwire.org
>>>>
>>>> Technology for Scholarly Communication
>
> --
> Ben Bromhead
> CTO | Instaclustr
> +1 650 284 9692
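For Eric's original RAID 0 question: the typical EC2 setup stripes the ephemeral SSDs with mdadm and mounts the array for Cassandra's data directories. A minimal sketch (device names are assumptions and vary by instance type/AMI; run as root):

import subprocess

def raid0_ephemeral(devices=("/dev/xvdb", "/dev/xvdc"),
                    md="/dev/md0", mountpoint="/var/lib/cassandra"):
    """Stripe ephemeral SSDs into RAID 0 and mount for Cassandra data."""
    subprocess.run(["mdadm", "--create", md, "--level=0",
                    "--raid-devices=%d" % len(devices), *devices], check=True)
    subprocess.run(["mkfs.ext4", md], check=True)  # or xfs, per preference
    subprocess.run(["mkdir", "-p", mountpoint], check=True)
    subprocess.run(["mount", md, mountpoint], check=True)

Two striped devices roughly double sequential throughput over one, at the cost that either disk failing loses the whole array - usually acceptable for ephemerals, since an instance failure loses the data anyway and the node is rebuilt from replicas.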