Hi Joe,
> I am doing resource planning and could use some help.

I worked alone for 4 years operating a growing cluster (from 3 to 60+ nodes, from t1.micro to i2.2xlarge AWS instances), and besides that biggest cluster I had 2 other clusters and was handling MySQL too :'(. I have now joined a team of Cassandra experts, so I have worked in both extreme situations.

So, first thing: it is doable, one guy alone can do this. And I probably could have handled more nodes (plus I was doing MySQL schema management for new features).

Second thing: it is a real PITA for one guy to be on his own operating a production Cassandra cluster, really.

I would say that having a guy working alone is a bad idea. First, because when someone is digging into a Cassandra issue, an external point of view is often very enlightening. It is a complex system and discussing possible solutions is often worth it. The community can help here (this list, IRC, ...).

Also, any time your operations guy is out, who will handle operations? What if your operator leaves your company? This can happen, as a lot of people want to recruit a good Cassandra operator. My point is that using a Replication Factor of 2 or 3 for data (with the extra cost induced) and of 1 for people produces a sort of Single Point of Failure in Cassandra. Cassandra often needs to be 100% up (or close to it); what happens if Cassandra starts failing during your operator's 3-week-long holiday?

I worked nights, holidays, Christmas, ... in the past. If there had been 2 of us, I would have done (roughly) half of the work, and not during my time off. I might then have stayed longer at my previous company. Be careful: having only one operator on a Cassandra cluster will probably exhaust him quite quickly.

From my own experience, I would say you should probably have a second person as soon as you can (they can do something other than Cassandra half of the time if needed at the start).
But I truly believe 2 people who know Cassandra and are able to act on it is a good number to reach asap. If you don't want to do that, at least make sure some other people in your team can do first-level support (restart nodes, monitor, roughly understand how Cassandra works, apply commands given by the operator; have him prepare common troubleshooting procedures) and/or consider using external support when your operator is blocked, as he probably won't be able to answer everything on his own.

Data is the beating heart of many businesses, and it is still often under-provisioned (machines and people) because it is a cost with no direct income. Think about how important data availability / consistency / latency is in your case and act accordingly :-).

Then, once there is a team of 2, you don't need to scale according to the number of nodes (the Netflix C* team used to be 2 or 3 people and they had 1000+ servers, if I remember correctly). The whole thing is making sure operators can, and are encouraged to, automate common actions and script things as much as they think is useful. Then a few people can handle a lot of nodes; the number of operators is far from being linearly related to the number of nodes.

> How many operations people will I need to manage my Cassandra
> implementation for two sites with 10 nodes at each site? As, my cluster
> grows at what point will I need to add another person?

I would finally say that the number of operators needed might actually be more related to the number of devs / tech team members you have. My team had 60 devs, and me alone operating Cassandra, which I believe is ridiculous and I don't recommend. More devs = more features, more modeling work, more services hitting the database, etc.

It also depends on the management / automation systems in place: basically, if adding a node is a 5-minute operation for this guy or a 2-hour operation (just to have the node prepared), you obviously don't need the same number of people.

FWIW, here is a post I wrote that I believe might help your operator handle your small cluster: http://thelastpickle.com/blog/2016/03/21/running-commands-cluster-wide.html

Those are only personal thoughts and considerations based on my own experience. Others might have other considerations, or see things from a perspective other than the Cassandra operator's (which is mine here).

I hope you will be kind to your operator and find him a friend to talk with! I think it is better for both the company and him.

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-04-01 7:41 GMT+02:00 Joe Hicks <joehi...@gmail.com>:

> I am doing resource planning and could use some help. How many operations
> people will I need to manage my Cassandra implementation for two sites with
> 10 nodes at each site? As, my cluster grows at what point will I need to
> add another person?