Oops... that was supposed to be "not a fan of video"! I have no problem with the guys in the video!
-- Jack Krupansky

On Mon, Feb 1, 2016 at 8:51 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> I'm not a fan of guy - this appears to be the slideshare corresponding to
> the video:
>
> http://www.slideshare.net/AmazonWebServices/bdt323-amazon-ebs-cassandra-1-million-writes-per-second
>
> My apologies if my questions are actually answered in the video or slides;
> I only did a quick scan of the slide text.
>
> I'm curious where the EBS physical devices actually reside - are they in
> the same rack, the same data center, or the same availability zone? People
> try to minimize network latency between nodes, so how exactly does EBS
> avoid network latency?
>
> Did your test use Amazon EBS-optimized instances?
>
> SSD or magnetic, and does it make any difference?
>
> What information is available on EBS performance at peak times, when
> multiple AWS customers have spikes of demand?
>
> Is RAID much of a factor, or any help at all, when using EBS?
>
> How exactly is EBS provisioned in terms of its own HA? With a properly
> configured Cassandra cluster, RF provides HA, so what is the equivalent
> for EBS? If I have RF=3, what assurance is there that those three EBS
> volumes aren't all in the same physical rack?
>
> For multi-data-center operation, what configuration options ensure that
> the EBS volumes for each DC are truly physically separated?
>
> In terms of syncing data for the commit log: if the OS call to sync an
> EBS volume returns, is the commit log data absolutely 100% synced at the
> hardware level on the EBS end, such that a power failure of the systems
> on which the EBS volumes reside still guarantees availability of the
> fsynced data? As well, is return from fsync an absolute guarantee of
> sstable durability when Cassandra is about to delete the commit log,
> including when the two are on different volumes? In practice we would
> like some significant degree of pipelining of data, such as during the
> full processing of flushing memtables, but for the fsync at the end a
> solid guarantee is needed.
>
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.pl...@gmail.com> wrote:
>
>> Jeff,
>>
>> If EBS goes down, then EBS GP2 will go down as well, no? I'm not
>> discounting EBS, but prior outages are worrisome.
>>
>>
>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> Feel free to choose what you'd like, but EBS outages were also
>>> addressed in that video (second half, discussion by Dennis Opacki).
>>> 2016 EBS isn't the same as 2011 EBS.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.pl...@gmail.com> wrote:
>>>
>>> Thank you all for the suggestions. I'm torn between GP2 and ephemeral.
>>> After testing, GP2 is a viable contender for our workload. The only
>>> worry I have is EBS outages, which have happened.
>>>
>>> On Sunday, January 31, 2016, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>> wrote:
>>>
>>>> Also in that video - it's long but worth watching.
>>>>
>>>> We tested up to 1M reads/second as well, blowing out the page cache
>>>> to ensure we weren't "just" reading from memory.
>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Jirsa
>>>>
>>>>
>>>> On Jan 31, 2016, at 9:52 AM, Jack Krupansky <jack.krupan...@gmail.com>
>>>> wrote:
>>>>
>>>> How about reads? Any differences between read-intensive and
>>>> write-intensive workloads?
>>>>
>>>> -- Jack Krupansky
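For concreteness, the commit-log durability question above comes down to
whether the usual append-then-fsync sequence is flushed all the way through
the EBS back end before fsync returns, rather than merely acknowledged by the
guest kernel. Below is a minimal Python sketch of that sequence; the path and
payload are illustrative only, and nothing in it is taken from the talk or
this thread.

# Sketch of the append-then-fsync pattern a commit log relies on. Whether
# a successful fsync() on an EBS-backed filesystem implies the bytes are
# power-safe on the EBS side is exactly the open question asked above.
import os

def durable_append(path: str, payload: bytes) -> None:
    """Append payload and return only after fsync reports success."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # flush file data and metadata to the device
    finally:
        os.close(fd)
    # If the file was newly created, the directory entry needs its own
    # fsync before the file itself is guaranteed to survive a crash.
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)

if __name__ == "__main__":
    # Illustrative path; Cassandra's real commit log location is set by
    # commitlog_directory in cassandra.yaml.
    durable_append("/tmp/commitlog-segment.log", b"mutation bytes\n")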
>>>> On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>>> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> We run on 4T GP2 volumes, which guarantee 10k IOPS. Even at 1M writes
>>>>> per second on 60 nodes, we didn't come close to hitting even 50%
>>>>> utilization (10k is more than enough for most workloads). PIOPS is
>>>>> not necessary.
>>>>>
>>>>>
>>>>>
>>>>> From: John Wong
>>>>> Reply-To: "user@cassandra.apache.org"
>>>>> Date: Saturday, January 30, 2016 at 3:07 PM
>>>>> To: "user@cassandra.apache.org"
>>>>> Subject: Re: EC2 storage options for C*
>>>>>
>>>>> For production I'd stick with ephemeral disks (aka instance storage)
>>>>> if you are running a lot of transactions.
>>>>> However, for a regular small testing/QA cluster, or something you
>>>>> know you will want to reload often, EBS is definitely good enough,
>>>>> and we haven't had issues 99% of the time. The 1% was an anomaly
>>>>> where we had flushes blocked.
>>>>>
>>>>> But Jeff, kudos on being able to use EBS. I didn't go through the
>>>>> video; do you actually use PIOPS or just standard GP2 in your
>>>>> production cluster?
>>>>>
>>>>> On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng <br...@blockcypher.com>
>>>>> wrote:
>>>>>
>>>>>> Yep, that motivated my question "Do you have any idea what kind of
>>>>>> disk performance you need?". If you need the performance, it's hard
>>>>>> to beat ephemeral SSD in RAID 0 on EC2, and it's a solid,
>>>>>> battle-tested configuration. If you don't, though, EBS GP2 will save
>>>>>> a _lot_ of headache.
>>>>>>
>>>>>> Personally, on small clusters like ours (12 nodes), we've found our
>>>>>> choice of instance dictated much more by the balance of price, CPU,
>>>>>> and memory. We're using GP2 SSD and we find that for our patterns
>>>>>> the disk is rarely the bottleneck. YMMV, of course.
>>>>>>
>>>>>> On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa
>>>>>> <jeff.ji...@crowdstrike.com> wrote:
>>>>>>
>>>>>>> If you have to ask that question, I strongly recommend m4 or c4
>>>>>>> instances with GP2 EBS. When you don't care about replacing a node
>>>>>>> because of an instance failure, go with i2 + ephemerals. Until
>>>>>>> then, GP2 EBS is capable of amazing things, and greatly simplifies
>>>>>>> life.
>>>>>>>
>>>>>>> We gave a talk on this topic at both Cassandra Summit and AWS
>>>>>>> re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4 It's very
>>>>>>> much a viable option, despite any old documents online that say
>>>>>>> otherwise.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Eric Plowe
>>>>>>> Reply-To: "user@cassandra.apache.org"
>>>>>>> Date: Friday, January 29, 2016 at 4:33 PM
>>>>>>> To: "user@cassandra.apache.org"
>>>>>>> Subject: EC2 storage options for C*
>>>>>>>
>>>>>>> My company is planning on rolling out a C* cluster in EC2. We are
>>>>>>> thinking about going with ephemeral SSDs. The question is this:
>>>>>>> should we put two in RAID 0 or just go with one? We currently run a
>>>>>>> cluster in our data center with two 250 GB Samsung 850 EVOs in
>>>>>>> RAID 0, and we are happy with the performance we are seeing thus
>>>>>>> far.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Eric
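As a back-of-the-envelope check on the "4T GP2 guarantees 10k IOPS" figure
above: the gp2 baseline formula AWS published around the time of this thread
was 3 IOPS per GiB, with a 100 IOPS floor, a 10,000 IOPS per-volume cap, and
burst to 3,000 IOPS for smaller volumes. The Python helper below only encodes
that arithmetic as a sketch; treat the constants as a 2016-era snapshot of
the documentation rather than a current reference.

# gp2 sizing arithmetic, assuming the 2016-era published limits.
IOPS_PER_GIB = 3
BASELINE_FLOOR = 100   # smallest baseline any gp2 volume gets
BASELINE_CAP = 10_000  # per-volume cap at the time of this thread
BURST_IOPS = 3_000     # burst ceiling for volumes below the burst threshold

def gp2_baseline_iops(size_gib: int) -> int:
    """IOPS a gp2 volume of this size can sustain indefinitely."""
    return min(max(BASELINE_FLOOR, IOPS_PER_GIB * size_gib), BASELINE_CAP)

def gp2_peak_iops(size_gib: int) -> int:
    """Peak IOPS, counting burst credits for volumes under ~1 TiB."""
    baseline = gp2_baseline_iops(size_gib)
    return max(baseline, BURST_IOPS) if size_gib < 1_000 else baseline

if __name__ == "__main__":
    print(gp2_baseline_iops(4_096))                    # 4 TiB volume -> 10000 (hits the cap)
    print(gp2_baseline_iops(100), gp2_peak_iops(100))  # 100 GiB -> 300 baseline, 3000 burst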