Re: [Openstack] Scheduler issues in folsom

2012-11-01 Thread Jonathan Proulx
On Wed, Oct 31, 2012 at 10:54 PM, Vishvananda Ishaya wrote:
> My patch here seems to fix the issue in the one scheduler case:
>
> https://github.com/vishvananda/nova/commit/2eaf796e60bd35319fe6add6dd04359546a21682
>
> If you could give that a try on your scheduler node and see if it fixes it
>

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Huang Zhiteng
Hi Vish, I like the idea of keeping host states in memory (or in an external cache like memcached). This should fix the root cause of why the core filter doesn't work for Jonathan in his case, but for memory, I think we still need to find a way to handle those hypervisors that don't allocate the entire memory for the guest

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Vishvananda Ishaya
On Oct 31, 2012, at 5:57 PM, Vishvananda Ishaya wrote:
>
> Looking at the code it appears that the relevant info is being sent down to
> the compute node. That said I can't seem to repro your issue with even just
> the ram filter. I can't get it to overallocate on one node unless I
> specifi

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Vishvananda Ishaya
On Oct 31, 2012, at 1:44 PM, Jonathan Proulx wrote:
> I'd only been pushing these options to the host the scheduler runs on, is it
> that simple? I'm delighted if I'm an idiot and just need a few lines in a
> config file, but puzzled why this was (seemingly at least) working with
> Essex, c

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Jonathan Proulx
On Wed, Oct 31, 2012 at 3:53 PM, Vishvananda Ishaya wrote:
>
> On Oct 31, 2012, at 12:25 PM, Jonathan Proulx wrote:
> >
> > again despite:
> > scheduler_available_filters=nova.scheduler.filters.standard_filters
> > scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFil

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Vishvananda Ishaya
On Oct 31, 2012, at 12:25 PM, Jonathan Proulx wrote:
>
> again despite:
> scheduler_available_filters=nova.scheduler.filters.standard_filters
> scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter,RetryFilter
> cpu_allocation_ratio=1.0
> ram_allocation_ratio=1.0

I

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Jonathan Proulx
On Wed, Oct 31, 2012 at 1:47 PM, Huang Zhiteng wrote:
> Hi Jonathan,
>
> If I understand correctly, that bug is about multiple scheduler

There is only a single process, I was reading it as relating to include threads within a single process, but they should clearly be able to serialize this withi

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Huang Zhiteng
Hi Jonathan,

If I understand correctly, that bug is about multiple scheduler instances (processes) doing scheduling at the same time. When a compute node finds itself unable to fulfil a create_instance request, it resends the request back to the scheduler (max_retry is there to avoid endless retries). From you

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Jonathan Proulx
Hi All,

While the RetryScheduler may not have been designed specifically to fix this issue, https://bugs.launchpad.net/nova/+bug/1011852 suggests that it is meant to fix it, at least if "it" is a scheduler race condition, which is my suspicion. This is my current scheduler config, which gives the failure

Re: [Openstack] Scheduler issues in folsom

2012-10-31 Thread Daniel P. Berrange
On Wed, Oct 31, 2012 at 10:40:57AM +0800, Huang Zhiteng wrote:
> On Wed, Oct 31, 2012 at 10:07 AM, Vishvananda Ishaya
> wrote:
> >
> > On Oct 30, 2012, at 7:01 PM, Huang Zhiteng wrote:
> >
> >> I'd suggest the same ratio too. But besides memory overcommitment, I
> >> suspect this issue is also r

Re: [Openstack] Scheduler issues in folsom

2012-10-30 Thread Huang Zhiteng
On Wed, Oct 31, 2012 at 10:07 AM, Vishvananda Ishaya wrote:
>
> On Oct 30, 2012, at 7:01 PM, Huang Zhiteng wrote:
>
>> I'd suggest the same ratio too. But besides memory overcommitment, I
>> suspect this issue is also related to how KVM does memory allocation (it
>> doesn't do actual allocation of

Re: [Openstack] Scheduler issues in folsom

2012-10-30 Thread Vishvananda Ishaya
On Oct 30, 2012, at 7:01 PM, Huang Zhiteng wrote:
> I'd suggest the same ratio too. But besides memory overcommitment, I
> suspect this issue is also related to how KVM does memory allocation (it
> doesn't do actual allocation of the entire memory for the guest when
> booting). I've seen compute node

Re: [Openstack] Scheduler issues in folsom

2012-10-30 Thread Huang Zhiteng
On Wed, Oct 31, 2012 at 12:21 AM, Jonathan Proulx wrote:
> Hi All,
>
> I'm having what I consider serious issues with the scheduler in
> Folsom. It seems to relate to the introduction of threading in the
> scheduler.

How many scheduler instances do you have?

>
> For a number of local reasons we pre

Re: [Openstack] Scheduler issues in folsom

2012-10-30 Thread Huang Zhiteng
On Wed, Oct 31, 2012 at 6:55 AM, Vishvananda Ishaya wrote:
> The retry scheduler is NOT meant to be a workaround for this. It sounds like
> the ram filter is not working properly somehow. Have you changed the setting
> for ram_allocation_ratio? It defaults to 1.5 allowing overallocation, but in
>

Re: [Openstack] Scheduler issues in folsom

2012-10-30 Thread Vishvananda Ishaya
The retry scheduler is NOT meant to be a workaround for this. It sounds like the ram filter is not working properly somehow. Have you changed the setting for ram_allocation_ratio? It defaults to 1.5 allowing overallocation, but in your case you may want 1.0. I would be using the following two conf

[Openstack] Scheduler issues in folsom

2012-10-30 Thread Jonathan Proulx
Hi All,

I'm having what I consider serious issues with the scheduler in Folsom. It seems to relate to the introduction of threading in the scheduler. For a number of local reasons we prefer to have instances start on the compute node with the least amount of free RAM that is still enough to satisf
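
The placement policy discussed in this thread (put each instance on the host with the least free RAM that can still hold it, subject to ram_allocation_ratio) can be illustrated with a small sketch. This is not nova's scheduler code: the `Host` class and the `usable_ram_mb` and `pick_fill_first` functions are hypothetical names invented for illustration, and the only thing taken from the thread is the idea that a ratio of 1.0 forbids overallocation while 1.5 permits it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Host:
    """Hypothetical view of a compute node's RAM, in megabytes."""
    name: str
    total_ram_mb: int
    used_ram_mb: int


def usable_ram_mb(host: Host, ram_allocation_ratio: float) -> float:
    """RAM still available for new instances under the given ratio."""
    return host.total_ram_mb * ram_allocation_ratio - host.used_ram_mb


def pick_fill_first(
    hosts: Sequence[Host],
    requested_mb: int,
    ram_allocation_ratio: float = 1.0,
) -> Optional[Host]:
    """Return the fullest host that can still fit the request, or None."""
    # Filter step: drop hosts that cannot fit the request.
    candidates = [
        h for h in hosts
        if usable_ram_mb(h, ram_allocation_ratio) >= requested_mb
    ]
    if not candidates:
        return None
    # Weighing step: prefer the host with the least remaining room.
    return min(candidates, key=lambda h: usable_ram_mb(h, ram_allocation_ratio))


hosts = [
    Host("a", total_ram_mb=8192, used_ram_mb=6144),
    Host("b", total_ram_mb=8192, used_ram_mb=2048),
    Host("c", total_ram_mb=8192, used_ram_mb=8000),
]
chosen_strict = pick_fill_first(hosts, 2048, ram_allocation_ratio=1.0)
chosen_overcommit = pick_fill_first(hosts, 2048, ram_allocation_ratio=1.5)
```

With a ratio of 1.0 the nearly full host "c" is filtered out; with 1.5 it passes the filter and, being the fullest, is selected, which is the overallocation the thread is trying to avoid. Note that a sketch like this only behaves correctly if `used_ram_mb` is current when each request is evaluated; stale host state is the failure mode suspected in the discussion.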