I did a presentation on diagnosing performance problems in production at the US & Euro summits, in which I covered quite a few tools & preventative measures you should know when running a production cluster. You may find it useful: http://rustyrazorblade.com/2014/09/cassandra-summit-recap-diagnosing-problems-in-production/
On OpsCenter: I recommend it. It gives you a nice dashboard. I don't think it's completely comprehensive (no tool really is), but it gets you 90% of the way there.

It's a good idea to run repairs, especially if you're doing deletes or querying at CL=ONE. I assume you're not using QUORUM, because with RF=2 a quorum is two replicas, which is the same as CL=ALL. I recommend at least RF=3: with RF=2, losing one server leaves you with a single copy of your data, so you're on the edge of data loss.

On Tue Dec 09 2014 at 7:19:32 PM Neha Trivedi <nehajtriv...@gmail.com> wrote:

> Hi,
> We have Two Node Cluster Configuration in production with RF=2.
>
> Which means that the data is written in both the clusters and it's running
> for about a month now and has good amount of data.
>
> Questions?
> 1. What are the best practices for maintenance?
> 2. Is OPScenter required to be installed or I can manage with nodetool
> utility?
> 3. Is it necessary to run repair weekly?
>
> thanks
> regards
> Neha
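
As a footnote to the quorum remark in the reply, here is a rough sketch of the arithmetic, assuming the usual definition of a quorum as floor(RF / 2) + 1 replicas. The function names are made up for illustration and are not part of any Cassandra tool or driver.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at CL=QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Replicas that can be down while a QUORUM request can still succeed."""
    return rf - quorum(rf)


if __name__ == "__main__":
    # Compare the two-node setup from the question with the suggested RF=3.
    for rf in (2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"can lose {tolerable_failures(rf)} replica(s)")
```

With RF=2 the quorum is 2, so QUORUM needs every replica, just like ALL. With RF=3 the quorum is still 2, which leaves room for one replica to be down.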