On Thu, 2013-12-12 at 15:22 +0100, Hugh O. Brock wrote:
> On Thu, Dec 12, 2013 at 03:11:14PM +0100, Ladislav Smola wrote:
> > Agree with this.
> >
> > Though I am an optimist, I believe that this time, we can avoid
> > calling multiple services in one request that depend on each other.
> > About the multiple users at once, this should be solved inside the
> > API calls of the services.
> >
> > So I think we should forbid building these complex API call
> > composites in the Tuskar-API. If we want something like this,
> > we should implement it properly inside the services themselves.
> > If we are not able to convince the community about it, maybe it's
> > just not that good a feature. :-D
> >
>
> It's worth adding that in the particular case Radomir cites (the
> "Deploy" button), even with all the locks in the world, the resources
> that we have supposedly requisitioned in the undercloud for the user may
> have already been allocated to someone else by Nova -- because Nova
> currently doesn't allow reservation of resources. (There is work under
> way to allow this but it is quite a way off.) So we could find ourselves
> claiming for the user that we're going to deploy an overcloud at a
> certain scale and then find ourselves unable to do so.
>
> Frankly I think the whole multi-user case for Tuskar is far enough off
> that I would consider wrapping a single-login restriction around the
> entire thing and calling it a day... except that that would be
> crazy. I'm just trying to make the point that making these operations
> really safe for multiple users is way harder than just putting a lock on
> the tuskar API.

That's actually not that crazy, Hugh :) We've deployed more than a half
dozen availability zones, and I've never seen anyone trample over each
other trying to deploy OpenStack to the same set of bare-metal machines
at the same time... it simply doesn't happen in the real world -- or at
least, it would be so exceedingly rare that trying to deal with this
kind of thing is more of an academic exercise than anything else.

Instead of focusing on locking issues -- which I agree are very
important on the virtualized side of things, where resources are
"thinner" -- I believe that in the bare-metal world, a more useful focus
would be to ensure that the Tuskar API service treats related group
operations (like "deploy an undercloud on these nodes") in a way that
handles failures gracefully and/or atomically.

For example, if the construction or installation of one compute worker
failed, adding some retry or retry-after-wait-for-event logic would be
more useful than trying to put locks in a bunch of places to prevent
multiple sysadmins from trying to deploy on the same bare-metal nodes
(since it's just not gonna happen in the real world, and IMO, if it did
happen, the sysadmins/deployers should be punished and have to clean up
their own mess ;)

Best,
-jay
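
P.S. To make the retry idea above a bit more concrete, here is a rough
sketch of what "retry each node, and roll the group back if one node
never comes up" could look like. This is not Tuskar code: deploy_group,
retry, DeployError, and the deploy_node/undeploy_node callables are all
hypothetical names made up for illustration, and a real implementation
would probably wait on an event from the underlying service rather than
sleep.

```python
import time


class DeployError(Exception):
    """Raised when an operation still fails after all retries."""


def retry(operation, attempts=3, wait=0.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:  # sketch only: narrow this in real code
            last_error = exc
            if attempt < attempts:
                sleep(wait * attempt)  # simple linear back-off
    raise DeployError(str(last_error))


def deploy_group(nodes, deploy_node, undeploy_node, attempts=3, wait=0.0):
    """Deploy every node or none: undo finished nodes if one fails."""
    done = []
    try:
        for node in nodes:
            retry(lambda: deploy_node(node), attempts=attempts, wait=wait)
            done.append(node)
    except DeployError:
        # Roll back in reverse order so the group is left untouched.
        for node in reversed(done):
            undeploy_node(node)
        raise
    return done
```

The point of the design is that a transient failure on one node costs a
retry, and a permanent failure leaves no half-deployed group behind,
without any locking in the API layer.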