On 09/01/2015 09:01, Alex Xu wrote:
Hi all,
There is a bug when running Nova with Ironic:
https://bugs.launchpad.net/nova/+bug/1402658
The case is simple: take one baremetal node with 1024MB of RAM, then boot two
instances with a 512MB RAM flavor. Both instances will be scheduled to the
same baremetal node.
The problem is that on the scheduler side, the IronicHostManager consumes all
of the node's resources, no matter how much the instance actually uses. But on
the compute node side, the ResourceTracker doesn't consume resources that way;
it consumes them as it would for a normal virtual instance. Once the
instance's resources are claimed, the ResourceTracker updates the resource
usage, so the scheduler sees free resources on that node and tries to
schedule another new instance to it.
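To show the mismatch concretely, here is a tiny simplified sketch of the two
code paths (the class and field names are illustrative stand-ins, not the
actual Nova internals):

    class IronicHostState(object):
        def __init__(self, total_ram_mb):
            self.free_ram_mb = total_ram_mb

        def consume_from_instance(self, instance_ram_mb):
            # Scheduler side: the whole baremetal node is consumed,
            # whatever the flavor actually asks for.
            self.free_ram_mb = 0

    class GenericResourceTracker(object):
        def __init__(self, total_ram_mb):
            self.free_ram_mb = total_ram_mb

        def claim(self, instance_ram_mb):
            # Compute side: only the flavor's RAM is deducted, as for
            # a normal virtual instance.
            self.free_ram_mb -= instance_ram_mb

    tracker = GenericResourceTracker(total_ram_mb=1024)
    tracker.claim(512)
    print(tracker.free_ram_mb)  # 512 -> the scheduler now sees free RAM
                                # and places a second 512MB instance here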
I took a look at this: there is the NumInstancesFilter, which limits how many
instances can be scheduled to one host. So can we just use this filter to
achieve the goal? The maximum number of instances is configured by the
'max_instances_per_host' option; we could make the virt driver report how many
instances it supports. The Ironic driver would just report
max_instances_per_host=1, and the libvirt driver would report
max_instances_per_host=-1, meaning no limit. Then we could remove the
IronicHostManager and make the scheduler side simpler. Does that make sense,
or are there more traps?
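To make the idea concrete, here is a rough sketch of what the filter could
look like if the limit came from the virt driver through the host state rather
than from the global config option (the max_instances_per_host attribute on
host_state is an assumption for illustration, not an existing field):

    class NumInstancesFilter(object):
        def host_passes(self, host_state, filter_properties):
            # Assumed to be reported by the virt driver; today the real
            # filter reads this limit from the config option instead.
            limit = getattr(host_state, 'max_instances_per_host', -1)
            if limit < 0:
                # e.g. the libvirt driver reports -1: no per-host limit
                return True
            # e.g. the Ironic driver reports 1: one instance per node
            return host_state.num_instances < limit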
Thanks in advance for any feedback and suggestions.
Alex
Mmm, I think I disagree with your proposal. Let me explain why as best I can:
tl;dr: any proposal other than claiming at the scheduler level tends to be wrong.
The ResourceTracker should only be a module that provides stats about compute
nodes to the Scheduler. How the Scheduler consumes those resources when making
a decision should be a Scheduler-only concern.
Here, the problem is that the decision making is also shared with the
ResourceTracker, because of the claiming system managed by the context manager
when booting an instance. That means we have two distinct decision makers
validating a resource.
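For clarity, here is a stripped-down sketch of that second decision maker
(illustrative only; the real logic lives in nova.compute.claims and the
ResourceTracker):

    from contextlib import contextmanager

    class NoValidResources(Exception):
        pass

    @contextmanager
    def claim(tracker, requested_ram_mb):
        # Decision maker #2: the compute node re-validates what the
        # scheduler (decision maker #1) has already decided.
        if requested_ram_mb > tracker.free_ram_mb:
            raise NoValidResources()
        tracker.free_ram_mb -= requested_ram_mb
        try:
            yield
        except Exception:
            tracker.free_ram_mb += requested_ram_mb  # abort: undo the claim
            raise

Booting then runs the spawn inside "with claim(tracker, flavor_ram_mb):", and
that claim is exactly the second validation I'm talking about.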
Setting realism aside for a moment, let's discuss what a decision could mean
for something other than a compute node. OK, let's say a volume.
Provided that *something* reported volume statistics to the Scheduler, it
would be the Scheduler that decides whether a volume manager can accept a
volume request. There is no sense in re-validating the Scheduler's decision on
the volume manager side, beyond perhaps some error handling.
We know that the current model is kinda racy with Ironic, because there is a
2-stage validation (see [1]). I'm not in favor of making the model more
complex, but rather of putting all the claiming logic in the scheduler, which
is a longer path to win, but a safer one.
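To sketch what claiming at the scheduler level could look like (purely
illustrative, not an actual design), the idea is that picking a host and
deducting its resources happen as one atomic step, leaving a single decision
maker:

    import threading

    class ClaimingScheduler(object):
        def __init__(self, hosts):
            self._lock = threading.Lock()
            self._hosts = hosts  # {hostname: free_ram_mb}

        def schedule(self, requested_ram_mb):
            # The selection and the resource deduction are atomic, so
            # no second validation is needed on the compute node.
            with self._lock:
                for name, free in self._hosts.items():
                    if free >= requested_ram_mb:
                        self._hosts[name] = free - requested_ram_mb
                        return name
            raise RuntimeError('No valid host found')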
-Sylvain
[1] https://bugs.launchpad.net/nova/+bug/1341420
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev