There's nothing wrong with running a 3-node DC. A million writes an hour averages fewer than 300 writes a second, which is pretty trivial.
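For anyone who wants to check the numbers, here is a rough sketch of the arithmetic behind that figure, plus the quorum math behind the RF=2 remark further down. The function names are made up for illustration. The quorum formula, floor(RF / 2) + 1, is the one Cassandra documents for QUORUM and LOCAL_QUORUM, applied here to a single DC.

```python
def writes_per_second(writes_per_hour: int) -> float:
    """Average write rate, assuming the load is spread evenly over the hour."""
    return writes_per_hour / 3600


def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a QUORUM request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    print(f"1M writes/hour is about {writes_per_second(1_000_000):.0f} writes/s")
    for rf in (2, 3):
        print(
            f"RF={rf}: quorum={quorum(rf)}, "
            f"tolerates {tolerated_replica_failures(rf)} replica(s) down"
        )
```

With RF=2 the quorum is both replicas, so a single slow or dead node fails the request. With RF=3 the quorum is still two, which leaves room for one replica to be unavailable.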
Are you running provisioned SSD EBS volumes or the traditional, awful ones?

RF=2 with QUORUM is kind of pointless; it's the same as CL=ALL, because a quorum of two replicas is both of them. Not recommended. I don't know why your timeouts are happening, but when they do, RF=2 w/ QUORUM is going to make the problem worse. Either use RF=3 or use CL=ONE.

Your management is correct here. Throwing more hardware at this problem is the wrong solution given that your current hardware should be able to handle over 100x what it's doing right now.

Jon

On Mon, Dec 26, 2016 at 1:28 PM Carlos Rolo <r...@pythian.com> wrote:

> It depends on a lot of factors.
>
> What causes the cluster to get crazy? I/O, Network, CPU?
>
> I manage clusters of all sizes (even 3 nodes per DC) but it all depends on
> usage and configuration.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
> Pythian
>
> On Mon, Dec 26, 2016 at 9:26 PM, Ney, Richard <richard....@aspect.com>
> wrote:
>
> > My company has a product we’re about to deploy into AWS with Cassandra
> > set up as two 3-node clusters in two availability zones (m4.2xlarge with
> > two 500GB EBS volumes per node). We’re doing over a million writes per
> > hour with the cluster set up with RF=2 and local quorum writes. We run
> > successfully for several hours before Cassandra goes into the weeds and
> > we start getting write timeouts, to the point we must kill the Cassandra
> > JVM processes to get the Cassandra cluster to restart. I keep raising to
> > my upper management that the cluster is severely undersized, but
> > management is complaining that setting up 12 nodes is too expensive and
> > wants us to change the code to reduce load on Cassandra.
> >
> > So, the main question is “Is there any hope of success with a 3 node DC
> > setup of Cassandra in production or are we on a fool’s errand?”
> >
> > RICHARD NEY
> > TECHNICAL DIRECTOR, RESEARCH & DEVELOPMENT