Re: EC2 storage options for C*
Hi John,

We run on 4T GP2 volumes, which guarantee 10k IOPS. Even at 1M writes per second on 60 nodes, we didn’t come close to hitting even 50% utilization (10k is more than enough for most workloads). PIOPS is not necessary.

From: John Wong
Reply-To: "user@cassandra.apache.org"
Date: Saturday, January 30, 2016 at 3:07 PM
To: "user@cassandra.apache.org"
Subject: Re: EC2 storage options for C*

For production I'd stick with ephemeral disks (aka instance storage) if you are running a lot of transactions. However, for a regular small testing/QA cluster, or something you know you want to reload often, EBS is definitely good enough, and we haven't had issues 99% of the time. The other 1% is an anomaly where flushes get blocked.

But Jeff, kudos that you are able to use EBS. I didn't go through the video; do you actually use PIOPS or just standard GP2 in your production cluster?

On Sat, Jan 30, 2016 at 1:28 PM, Bryan Cheng wrote:

Yep, that motivated my question "Do you have any idea what kind of disk performance you need?". If you need the performance, it's hard to beat ephemeral SSD in RAID 0 on EC2, and it's a solid, battle-tested configuration. If you don't, though, EBS GP2 will save a _lot_ of headache.

Personally, on small clusters like ours (12 nodes), we've found our choice of instance dictated much more by the balance of price, CPU, and memory. We're using GP2 SSD and we find that for our patterns the disk is rarely the bottleneck. YMMV, of course.

On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa wrote:

If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life.

We gave a talk on this topic at both Cassandra Summit and AWS re:Invent: https://www.youtube.com/watch?v=1R-mgOcOSd4

It’s very much a viable option, despite any old documents online that say otherwise.

From: Eric Plowe
Reply-To: "user@cassandra.apache.org"
Date: Friday, January 29, 2016 at 4:33 PM
To: "user@cassandra.apache.org"
Subject: EC2 storage options for C*

My company is planning on rolling out a C* cluster in EC2. We are thinking about going with ephemeral SSDs. The question is this: should we put two in RAID 0 or just go with one? We currently run a cluster in our data center with two 250 GB Samsung 850 EVOs in RAID 0, and we are happy with the performance we are seeing thus far.

Thanks!

Eric
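For reference, the "4T GP2 volumes guarantee 10k IOPS" figure above can be reproduced with a little arithmetic. This is only a sketch: the constants reflect the gp2 rules as I understand them at the time of this thread (3 IOPS per GiB, a 100 IOPS floor, a 10,000 IOPS cap), and the function names are made up for illustration. The per-node write rate is a plain division that ignores replication, and client writes are not the same thing as disk IOPS, since Cassandra appends to the commit log and flushes memtables in large sequential writes.

```python
# Assumed gp2 rules circa early 2016; check current AWS docs before relying on them.
GP2_IOPS_PER_GIB = 3        # baseline IOPS earned per GiB of volume size
GP2_MIN_IOPS = 100          # floor for very small volumes
GP2_MAX_IOPS_2016 = 10_000  # per-volume cap at the time of the thread


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume of the given size (hypothetical helper)."""
    return max(GP2_MIN_IOPS, min(size_gib * GP2_IOPS_PER_GIB, GP2_MAX_IOPS_2016))


def writes_per_node(cluster_writes_per_sec: int, nodes: int) -> float:
    """Average client writes per second per node, ignoring replication."""
    return cluster_writes_per_sec / nodes


if __name__ == "__main__":
    # A 4 TiB volume is 4096 GiB; 4096 * 3 exceeds the cap, so the cap applies.
    print("4 TiB gp2 baseline IOPS:", gp2_baseline_iops(4096))
    # 1M writes/s spread over the 60 nodes mentioned above.
    print("client writes/s per node:", round(writes_per_node(1_000_000, 60)))
```

Under these assumptions, any gp2 volume of roughly 3.3 TiB or more already sits at the cap, so the extra capacity of a 4T volume buys space rather than additional baseline IOPS.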
Re: EC2 storage options for C*
How about reads? Any differences between read-intensive and write-intensive workloads?

-- Jack Krupansky

On Sun, Jan 31, 2016 at 3:13 AM, Jeff Jirsa wrote:
> Hi John,
>
> We run using 4T GP2 volumes, which guarantee 10k iops. Even at 1M writes
> per second on 60 nodes, we didn’t come close to hitting even 50%
> utilization (10k is more than enough for most workloads). PIOPS is not
> necessary.
Re: EC2 storage options for C*
Also in that video - it's long but worth watching.

We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory.

--
Jeff Jirsa

On Jan 31, 2016, at 9:52 AM, Jack Krupansky wrote:
> How about reads? Any differences between read-intensive and write-intensive
> workloads?
>
> -- Jack Krupansky
Re: EC2 storage options for C*
Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2 after testing is a viable contender for our workload. The only worry I have is EBS outages, which have happened.

On Sunday, January 31, 2016, Jeff Jirsa wrote:
> Also in that video - it's long but worth watching
>
> We tested up to 1M reads/second as well, blowing out page cache to ensure
> we weren't "just" reading from memory
>
> --
> Jeff Jirsa
Re: EC2 storage options for C*
Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS.

--
Jeff Jirsa

On Jan 31, 2016, at 8:27 PM, Eric Plowe wrote:
> Thank you all for the suggestions. I'm torn between GP2 vs Ephemeral. GP2
> after testing is a viable contender for our workload. The only worry I have
> is EBS outages, which have happened.
Re: EC2 storage options for C*
Jeff,

If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting EBS, but prior outages are worrisome.

On Sunday, January 31, 2016, Jeff Jirsa wrote:
> Free to choose what you'd like, but EBS outages were also addressed in
> that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the
> same as 2011 EBS.
>
> --
> Jeff Jirsa
Re: EC2 storage options for C*
Yes, but getting at why you think EBS is going down is the real point. New GM in 2011. Very different product. 35:40 in the video

--
Jeff Jirsa

On Jan 31, 2016, at 9:57 PM, Eric Plowe wrote:
> Jeff,
>
> If EBS goes down, then EBS Gp2 will go down as well, no? I'm not discounting
> EBS, but prior outages are worrisome.
Re: EC2 storage options for C*
http://m.theregister.co.uk/2013/08/26/amazon_ebs_cloud_problems/

That's what I'm worried about. Granted, that's an article from 2013, and while the general purpose EBS volumes are performant for a production C* workload, I'm worried about EBS outages. If EBS is down, my cluster is down.

On Monday, February 1, 2016, Jeff Jirsa wrote:
> Yes, but getting at why you think EBS is going down is the real point. New
> GM in 2011. Very different product. 35:40 in the video
>
> --
> Jeff Jirsa
Can't select count(*)
Hi all!

I have a table with a simple primary key (a single column) and ~1 million records. The table is stored on a single-node C* 2.2.4.

Problem: when I try to execute "SELECT count(*) FROM my_table;", the operation times out. As I understand it, 1 million rows is not a big enough dataset to need MapReduce to count it, so I think something is wrong with my configuration. I also can't do "COPY TO/FROM" when the dataset is larger than 30,000 rows.

Maybe the hardware is too weak (AWS EC2 t2.small), but even on a t2.large I had a timeout on COPY with just 300,000 rows.

My configuration is the default config from the deb package. Does anybody know what I should tweak there?

Thank you.
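One workaround sometimes used when a full-table count(*) times out is to count in slices of the token range and add up the partial counts on the client. The following is a minimal sketch, not a diagnosis of the timeout. It assumes the default Murmur3Partitioner (tokens from -2^63 to 2^63-1) and a hypothetical partition key column named `id`; only `my_table` comes from the question above. The script just generates the CQL statements; running them (in cqlsh or through a driver) and summing the results is left out.

```python
# Token bounds for Murmur3Partitioner (assumed to be the partitioner in use).
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def token_ranges(splits):
    """Split the full token range into `splits` contiguous, inclusive (lo, hi) pairs."""
    if splits < 1:
        raise ValueError("splits must be at least 1")
    step = (MAX_TOKEN - MIN_TOKEN + 1) // splits
    ranges = []
    lo = MIN_TOKEN
    for i in range(splits):
        # The last slice absorbs any remainder so the whole range is covered.
        hi = MAX_TOKEN if i == splits - 1 else lo + step - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def count_queries(table, pk, splits):
    """Build one count(*) statement per token slice; `pk` is the partition key column."""
    return [
        f"SELECT count(*) FROM {table} "
        f"WHERE token({pk}) >= {lo} AND token({pk}) <= {hi};"
        for lo, hi in token_ranges(splits)
    ]


if __name__ == "__main__":
    # `id` is a placeholder column name; 16 slices is an arbitrary choice.
    for query in count_queries("my_table", "id", 16):
        print(query)
```

Each slice scans only a fraction of the table, so an individual query is more likely to finish within the default request timeout; the number of slices is a tuning knob rather than a recommended value.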