Hello Matt,

Starting 1000 instances in production already works for me. We are on OpenStack Newton. I described my configuration here:
https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/
If things blow up for you with hundreds of instances, there is probably a regression somewhere.

Thanks,
Saverio

2017-10-06 23:43 GMT+02:00 Matt Riedemann <mriede...@gmail.com>:
> I've been chasing something weird I was seeing in devstack when creating
> hundreds of instances in a single request where at some limit, things blow
> up in an unexpected way during scheduling and all instances were put into
> ERROR state. Given the environment I was running in, this shouldn't have
> been happening, and today we figured out what was actually happening. To
> summarize, we retry scheduling requests on RPC timeout so you can have
> scheduler_max_attempts greenthreads running concurrently trying to schedule
> 1000 instances and melt your scheduler.
>
> I've started a spec which goes into the details of the actual issue:
>
> https://review.openstack.org/#/c/510235/
>
> It also proposes a solution, but I don't feel it's the greatest solution, so
> there are also some alternatives in there.
>
> I'm really interested in operator feedback on this because I assume that
> people are dealing with stuff like this in production already, and have had
> to come up with ways to solve it.
>
> --
>
> Thanks,
>
> Matt
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators