This begins to sound like a hierarchical reservation system to me. Are 
databases even capable of doing this correctly?

If I were going to do something like this in, say, ZooKeeper, it would appear 
that this is just an atomic write to paths of resources (using the concept of 
a ZooKeeper txn to ensure the write happens atomically). With Gantt, will 
there be read-only DB slaves, or just one database? Will there be some 
required hierarchical locking scheme (always lock in the same order), like 
ZooKeeper would require, to avoid deadlock? If more than one DB 
(master-master, master-slave?), how will this work? Forgive me for my limited 
DB knowledge, but I thought RDBMSes used MVCC, which means that a read could 
return different data than what is being written (so the write would fail?). 
What about using something like Raft, ZooKeeper, …

-Josh

From: Chris Friesen <chris.frie...@windriver.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Date: Monday, March 17, 2014 at 2:08 PM
To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [nova] question about e41fb84 "fix anti-affinity 
race condition on boot"

On 03/17/2014 02:30 PM, Sylvain Bauza wrote:
There is a global concern here about how a holistic scheduler can
make decisions, and from which key metrics.
The current effort is leading to having the Gantt DB updated by the
resource tracker so that hosts can be scheduled appropriately.

If we consider these metrics insufficient, i.e. that Gantt should
perform an active check against another project, that's something which needs
to be considered carefully. IMHO, in that case, Gantt should only access
metrics through the project's REST API (and Python client) in order to
make sure that rolling upgrades can happen.
tl;dr: If Gantt requires access to Nova data, it should use the Nova
REST API, and not perform database access directly (even through the conductor)

Consider the case in point.

1) We create a server group with anti-affinity policy.  (So no two
instances in the group should run on the same compute node.)
2) We boot a server in this group.
3) Either simultaneously (on a different scheduler) or immediately after
(on the same scheduler) we boot another server in the same group.

Ideally the scheduler should enforce the policy without any races.
However, in the current code we don't update the instance entry in the
database with the chosen host until we actually try to create the instance
on that host.  Because of this we can end up putting both instances on the
same compute node.
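The race described above is a classic check-then-act window: both schedulers 
read the group's used-host set before either one writes its choice back. A 
toy sketch of the interleaving (names are illustrative, not Nova code):

```python
# Toy model of the anti-affinity race: the chosen host is not written
# back until boot time, so two schedulers can pick the same node.
group_hosts = set()        # hosts already used by instances in the group
all_hosts = ["node1", "node2", "node3"]

def pick_host(snapshot):
    """Scheduler decision based on a point-in-time read of the group."""
    return next(h for h in all_hosts if h not in snapshot)

# Both schedulers read the (still empty) group state before either writes.
snap_a = set(group_hosts)
snap_b = set(group_hosts)
choice_a = pick_host(snap_a)
choice_b = pick_host(snap_b)

# Only now, at boot time, are the choices recorded -- too late.
group_hosts.add(choice_a)
group_hosts.add(choice_b)

print(choice_a == choice_b)  # True: both landed on node1, policy violated
```

Nothing in this flow fails until the compute node itself re-checks the 
policy, which is exactly the late special-case check being criticized.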

Currently we only detect the problem when we go to actually boot the
instance on the compute node because we have a special-case check to
validate the policy.  Personally I think this is sort of a hack and it
would be better to detect the problem within the scheduler itself.

This is something that the scheduler should reasonably consider.  I see
it as effectively consuming resources, except that in this case the
resource is "the set of compute nodes not used by servers in the server
group".

Chris

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org<mailto:OpenStack-dev@lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
