On Wed, Oct 10, 2012 at 3:44 PM, Day, Phil <philip....@hp.com> wrote:
>
>> Per my understanding, this shouldn't happen no matter how fast you create
>> instances, since the requests are queued and the scheduler updates its
>> resource information after it processes each request.  The only cause of
>> the problem you saw that I can think of is that more than one scheduler is
>> doing the scheduling.
>
> I think the new retry logic is meant to be safe even if there is more than 
> one scheduler, as the requests are effectively serialised when they get to 
> the compute manager, which can then reject any that break its actual resource 
> limits?
>
Yes, but it seems Jonathan's filter list doesn't include RetryFilter, so it's
possible he ran into the race condition that RetryFilter was designed to
solve.
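
For reference, here's a minimal sketch of what enabling the retry path might
look like in nova.conf (option names as I remember them from Folsom, so worth
double-checking against your release):

  scheduler_available_filters=nova.scheduler.filters.standard_filters
  scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
  # maximum number of scheduling attempts per instance (defaults to 3)
  scheduler_max_attempts=3

With RetryFilter in the list, a host that has already failed a build for a
given request gets excluded from the next scheduling attempt instead of being
picked again.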

> -----Original Message-----
> From: openstack-bounces+philip.day=hp....@lists.launchpad.net 
> [mailto:openstack-bounces+philip.day=hp....@lists.launchpad.net] On Behalf Of 
> Huang Zhiteng
> Sent: 10 October 2012 04:28
> To: Jonathan Proulx
> Cc: openstack@lists.launchpad.net
> Subject: Re: [Openstack] Folsom nova-scheduler race condition?
>
> On Tue, Oct 9, 2012 at 10:52 PM, Jonathan Proulx <j...@jonproulx.com> wrote:
>> Hi All,
>>
>> Looking for a sanity test before I file a bug.  I very recently
>> upgraded my install to Folsom (on top of Ubuntu 12.04/kvm).  My
>> scheduler settings in nova.conf are:
>>
>> scheduler_available_filters=nova.scheduler.filters.standard_filters
>> scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
>> least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
>> compute_fill_first_cost_fn_weight=1.0
>> cpu_allocation_ratio=1.0
>>
>> Under Essex this worked as expected, filling systems based on available RAM
>> without exceeding a 1:1 allocation ratio of CPU resources.  With Folsom, if
>> I specify a moderately large number of instances to boot, or spin up single
>> instances in a tight shell loop, they all get scheduled on the same compute
>> node, well in excess of the number of available vCPUs.  If I start them one
>> at a time (using --poll in a shell loop so each instance is started before
>> the next launches) then I get the expected allocation behaviour.
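
[Just to illustrate the two launch patterns described above -- the flavor and
image names below are placeholders, not from the original report:

  # tight loop: "nova boot" returns as soon as the API accepts the request,
  # so all the requests hit the scheduler almost at once
  for i in $(seq 1 10); do
    nova boot --flavor m1.small --image <some-image> test-$i
  done

  # one at a time: --poll blocks until each build finishes, so the next
  # request is only submitted after the previous instance is running
  for i in $(seq 1 10); do
    nova boot --poll --flavor m1.small --image <some-image> test-$i
  done
]
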
>>
> Per my understanding, this shouldn't happen no matter how fast you create
> instances, since the requests are queued and the scheduler updates its
> resource information after it processes each request.  The only cause of
> the problem you saw that I can think of is that more than one scheduler is
> doing the scheduling.
>> I see https://bugs.launchpad.net/nova/+bug/1011852 which seems to attempt
>> to address this issue, but as I read it that "fix" is based on retrying
>> failures.  Since KVM is capable of overcommitting both CPU and memory, I
>> don't seem to get a retryable failure, just really bad performance.
>>
>> Am I missing something with this fix, is there a reported bug I didn't find
>> in my search, or is this really a bug no one has reported?
>>
>> Thanks,
>> -Jon
>>
>
> --
> Regards
> Huang Zhiteng
>

-- 
Regards
Huang Zhiteng

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
