On Wed, Oct 10, 2012 at 3:44 PM, Day, Phil <philip....@hp.com> wrote:
>
>> Per my understanding, this shouldn't happen no matter how (fast) you create
>> instances, since the requests are queued and the scheduler updates resource
>> information after it processes each request. The only possibility I can
>> think of that may cause the problem you met is that there is more than one
>> scheduler doing the scheduling.
>
> I think the new retry logic is meant to be safe even if there is more than
> one scheduler, as the requests are effectively serialised when they get to
> the compute manager, which can then reject any that break its actual
> resource limits?
>
Yes, but it seems Jonathan's filter list doesn't include RetryFilter, so it's
possible that he ran into the race condition that RetryFilter is meant to
solve.
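
For reference, a minimal sketch of what that might look like in nova.conf on
the scheduler node(s), based on the filter list quoted below with RetryFilter
prepended. Treat this as an untested assumption rather than known-good config;
in particular I'm going from memory that the retry cap option is called
scheduler_max_attempts and defaults to 3 in Folsom:

    # nova.conf (nova-scheduler) -- sketch only, RetryFilter added to the
    # filter list quoted in the original message below.
    scheduler_available_filters=nova.scheduler.filters.standard_filters
    scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
    least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
    compute_fill_first_cost_fn_weight=1.0
    cpu_allocation_ratio=1.0
    # Cap on how many times a request is rescheduled after a compute node
    # rejects it (I believe the Folsom default is 3).
    scheduler_max_attempts=3

With this in place, a request that a compute manager rejects for exceeding its
actual resource limits should be rescheduled onto another host rather than
failing outright, which is the behaviour the retry logic was added to provide.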
> -----Original Message-----
> From: openstack-bounces+philip.day=hp....@lists.launchpad.net
> [mailto:openstack-bounces+philip.day=hp....@lists.launchpad.net] On Behalf Of Huang Zhiteng
> Sent: 10 October 2012 04:28
> To: Jonathan Proulx
> Cc: openstack@lists.launchpad.net
> Subject: Re: [Openstack] Folsom nova-scheduler race condition?
>
> On Tue, Oct 9, 2012 at 10:52 PM, Jonathan Proulx <j...@jonproulx.com> wrote:
>> Hi All,
>>
>> Looking for a sanity check before I file a bug. I very recently
>> upgraded my install to Folsom (on top of Ubuntu 12.04/kvm). My
>> scheduler settings in nova.conf are:
>>
>> scheduler_available_filters=nova.scheduler.filters.standard_filters
>> scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
>> least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
>> compute_fill_first_cost_fn_weight=1.0
>> cpu_allocation_ratio=1.0
>>
>> This had been working in Essex to fill systems based on available RAM and
>> to not exceed a 1:1 allocation ratio of CPU resources. With Folsom, if I
>> specify a moderately large number of instances to boot, or spin up single
>> instances in a tight shell loop, they all get scheduled on the same compute
>> node, well in excess of the number of available vCPUs. If I start them one
>> at a time (using --poll in a shell loop so each instance is started before
>> the next launches), then I get the expected allocation behaviour.
>>
> Per my understanding, this shouldn't happen no matter how (fast) you create
> instances, since the requests are queued and the scheduler updates resource
> information after it processes each request. The only possibility I can
> think of that may cause the problem you met is that there is more than
> one scheduler doing the scheduling.
>
>> I see https://bugs.launchpad.net/nova/+bug/1011852 which seems to
>> attempt to address this issue, but as I read it that "fix" is based on
>> retrying failures. Since KVM is capable of over-committing both CPU
>> and memory, I don't seem to get a retryable failure, just really bad
>> performance.
>>
>> Am I missing something with this fix, or perhaps there's a reported bug
>> I didn't find in my search, or is this really a bug no one has
>> reported?
>>
>> Thanks,
>> -Jon

--
Regards
Huang Zhiteng

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp