On Mon, Nov 18, 2013 at 4:47 PM, Joshua Harlow <harlo...@yahoo-inc.com> wrote:
> An idea related to this, what would need to be done to make the DB have
> the exact state that a compute node is going through (and therefore the
> scheduler would not make unreliable/racy decisions, even when there are
> multiple schedulers). It's not like we are dealing with a system which can
> not know the exact state (as long as the compute nodes are connected to the
> network, and a network partition does not occur).

Good question, I don't have a clear idea of the amount of work required to do this.

> So maybe if we think about ways to correctly reserve resources, and keep
> up to date information about reserved resources we could then eliminate the
> race and eliminate the retries entirely?

What is the trade-off here? What benefits do we get at what cost? I have a vague idea but just want to be explicit here. Also for 'cloudy' things we embrace the eventually consistent model, and I don't think we should drop that.

> From: Joe Gordon <joe.gord...@gmail.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Date: Monday, November 18, 2013 3:32 PM
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
>
> On Mon, Nov 18, 2013 at 4:08 PM, yunhong jiang
> <yunhong.ji...@linux.intel.com> wrote:
>
>> On Mon, 2013-11-18 at 14:09 -0800, Joe Gordon wrote:
>> >
>> > Phil Day discussed this at the summit and I have finally gotten around
>> > to posting a POC of this.
>> >
>> > https://review.openstack.org/#/c/57053/
>>
>> Hi, Joe, why do you think the DB is not the exact state, as in your commit
>> message quoted below? I think the DB is kept up to date by the resource
>> tracker, am I right (the resource tracker gets the underlying resource
>> information periodically, but I think that information is mostly static)?
>> And I think the scheduler retry mainly comes from the race condition
>> between multiple scheduler instances.
>
> You answered the question yourself: the compute nodes (indirectly)
> update the DB periodically, so the further you are from the last periodic
> update, the less up to date the DB is.
>
> It's there for both reasons. But yes, it was originally put there because
> of the multi-scheduler race condition.
>
>> "We already have the concept that the DB isn't the exact state of the
>> world, right now it's updated every 10 seconds. And we use the scheduler
>> retry mechanism to handle cases where the scheduler was wrong."
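
To make the staleness point concrete, here is a toy sketch in Python. It is not Nova code: ComputeNode, NoCapacity, periodic_update and schedule are names made up for illustration, and RAM is the only resource modelled. The dict stands in for the DB, which is refreshed only by the periodic update, so a request placed between two updates can pick a host that is already full and has to fall back on the retry.

```python
from dataclasses import dataclass


class NoCapacity(Exception):
    """Raised when a request cannot be placed (hypothetical name)."""


@dataclass
class ComputeNode:
    """Toy compute node; only the node itself knows its exact free RAM."""
    name: str
    free_ram_mb: int

    def claim(self, ram_mb: int) -> None:
        # The authoritative check happens on the node, not in the DB.
        if ram_mb > self.free_ram_mb:
            raise NoCapacity(self.name)
        self.free_ram_mb -= ram_mb


def periodic_update(db: dict, nodes: list) -> None:
    """Stand-in for the periodic report that refreshes the DB view."""
    for node in nodes:
        db[node.name] = node.free_ram_mb


def schedule(db: dict, nodes_by_name: dict, ram_mb: int, max_attempts: int = 3):
    """Pick the host with the most free RAM according to the (possibly
    stale) DB view; on a failed claim, exclude that host and retry."""
    excluded = set()
    for attempt in range(1, max_attempts + 1):
        candidates = {
            host: free for host, free in db.items()
            if host not in excluded and free >= ram_mb
        }
        if not candidates:
            break
        host = max(candidates, key=candidates.get)
        try:
            nodes_by_name[host].claim(ram_mb)
            return host, attempt
        except NoCapacity:
            # The DB view was wrong about this host; try another one.
            excluded.add(host)
    raise NoCapacity("no valid host")


nodes = [ComputeNode("a", 4096), ComputeNode("b", 2048)]
nodes_by_name = {node.name: node for node in nodes}
db = {}
periodic_update(db, nodes)

# First request lands on "a"; the DB still says "a" has 4096 MB free.
print(schedule(db, nodes_by_name, 3072))  # ('a', 1)
# Second request trusts the stale view, fails on "a", succeeds on retry.
print(schedule(db, nodes_by_name, 2048))  # ('b', 2)
```

In this sketch the retry costs one extra round trip whenever the view is stale. Getting rid of it would mean making the claim and the DB update a single atomic step, which is the cost side of the trade-off asked about above.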
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev