On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote:
> I was wondering about how do you guys handle a large cluster (50+ machines).
Configuration management tools are awesome, until they aren't. I have used or played with all the popular ones and have been bitten by failures of those tools on large clusters, so my long-time preference has been to keep configs and scripts in a VCS, checking them in and out, and to push changes with parallel ssh (whichever one you like). Simple is good. If you don't deeply understand the config management system you have chosen, the unexpected may (will?) eventually happen. To all the servers at once.
Even when we are careful, we are human. No tool can prevent *all* mistakes. Test everything in a staging environment first!
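As a rough illustration of the VCS plus parallel ssh idea with a staging gate, here is a minimal Python sketch. It is not any particular parallel ssh tool. The function names (ssh_run, run_on_hosts, rollout) are made up for this example, and it assumes key-based ssh logins already work for every host.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, remote_cmd, timeout=60):
    """Run remote_cmd on host over ssh; return (exit_code, combined_output)."""
    try:
        proc = subprocess.run(
            # BatchMode=yes makes ssh fail instead of prompting for a password.
            ["ssh", "-o", "BatchMode=yes", host, remote_cmd],
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.returncode, proc.stdout + proc.stderr
    except (subprocess.TimeoutExpired, OSError) as exc:
        # Treat a timeout or a missing ssh binary as a failure on that host.
        return 255, str(exc)


def run_on_hosts(hosts, remote_cmd, runner=ssh_run, max_workers=10):
    """Run remote_cmd on every host in parallel; return {host: (code, output)}."""
    hosts = list(hosts)
    if not hosts:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda h: runner(h, remote_cmd), hosts)
        return dict(zip(hosts, results))


def rollout(staging, production, remote_cmd, runner=ssh_run):
    """Apply remote_cmd to staging first; touch production only if staging is clean."""
    staged = run_on_hosts(staging, remote_cmd, runner)
    failed = sorted(h for h, (code, _) in staged.items() if code != 0)
    if failed:
        raise RuntimeError("staging failed on: " + ", ".join(failed))
    return run_on_hosts(production, remote_cmd, runner)
```

A call might look like rollout(["stage1"], ["node1", "node2"], "cd /etc/myapp && git pull --ff-only"), where the host names and the checkout path are placeholders. The runner argument exists so the logic can be exercised with a fake function before it ever touches a real machine.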
--
Kind regards,
Michael

PS. Even staging doesn't prevent fallibility. :)
https://twitter.com/mshuler/status/520667739615395840