Re: [openstack-dev] [nova] question about e41fb84 "fix anti-affinity race condition on boot"

Chris Friesen Mon, 17 Mar 2014 16:42:08 -0700

On 03/17/2014 05:01 PM, Sylvain Bauza wrote:


> There are 2 distinct cases:
> 1. there are multiple schedulers involved in the decision
> 2. there is one single scheduler but there is a race condition on it
>
> About 1., I agree we need to see how the scheduler (and later on Gantt)
> could address decision-making based on distributed engines. At least, I
> consider the no-db scheduler blueprint responsible for using memcache
> instead of a relational DB could help some of these issues, as memcached
> can be distributed efficiently.

With a central database we could do a single atomic transaction that looks something like "select the first host A from list of hosts L that is not in the list of hosts used by servers in group G and then set the host field for server S to A". In that context simultaneous updates can't happen because they're serialized by the central database.
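
To make the idea concrete, here is a minimal sketch of such a transaction using Python's built-in sqlite3 module. The table layout, the column names and the helpers make_db() and claim_host() are all hypothetical and chosen only for illustration; this is not nova's schema or code.

```python
import sqlite3


def make_db():
    """Build a toy in-memory database (hypothetical schema, not nova's)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN/COMMIT/ROLLBACK statements below fully control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE servers (id TEXT PRIMARY KEY, group_id TEXT, host TEXT)"
    )
    conn.executemany(
        "INSERT INTO servers VALUES (?, ?, NULL)",
        [("s1", "G"), ("s2", "G"), ("s3", "G")],
    )
    return conn


def claim_host(conn, server_id, group_id, candidate_hosts):
    """Pick the first candidate host not used by the group and assign it.

    The read of the group's hosts and the write of the chosen host happen
    inside one transaction, so concurrent callers are serialized by the
    database. Returns the chosen host, or None if every candidate is taken.
    """
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        used = {
            row[0]
            for row in cur.execute(
                "SELECT host FROM servers "
                "WHERE group_id = ? AND host IS NOT NULL",
                (group_id,),
            )
        }
        for host in candidate_hosts:
            if host not in used:
                cur.execute(
                    "UPDATE servers SET host = ? WHERE id = ?",
                    (host, server_id),
                )
                cur.execute("COMMIT")
                return host
        cur.execute("ROLLBACK")  # no valid host: leave the row untouched
        return None
    except Exception:
        cur.execute("ROLLBACK")
        raise


conn = make_db()
# Three servers in the same anti-affinity group, but only two hosts.
results = [claim_host(conn, s, "G", ["A", "B"]) for s in ("s1", "s2", "s3")]
print(results)
```

The third server gets no host here because both candidates are already claimed by its group, which is the behaviour the anti-affinity policy asks for.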

How would one handle the above for simultaneous scheduling operations without a centralized data store? (I've never played with memcached, so I'm not really familiar with what it can do.)

> About 2., that's a concurrency issue which can be addressed thanks to
> common practices for synchronizing actions. IMHO, a local lock can be
> enough for ensuring isolation

It's not that simple though. Currently the scheduler makes a decision, but the results of that decision aren't actually kept in the scheduler or written back to the db until much later when the instance is actually spawned on the compute node. So when the next scheduler request comes in we violate the scheduling policy. Local locking wouldn't help this.
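
A toy model of the problem, as a rough sketch only: ToyScheduler, record_claims and spawn_finished() are made-up names for illustration and do not correspond to anything in the nova scheduler. Each schedule() call runs under a local lock, yet two back-to-back requests still land on the same host unless the decision itself is remembered somewhere before the spawn completes.

```python
import threading


class ToyScheduler:
    """Toy anti-affinity scheduler (hypothetical, not nova code)."""

    def __init__(self, hosts, record_claims):
        self.hosts = list(hosts)
        self.db_group_hosts = set()  # hosts the "db" knows about (spawned)
        self.pending = set()         # decisions made but not yet spawned
        self.record_claims = record_claims
        self.lock = threading.Lock()

    def schedule(self):
        # The lock serializes decisions, but it cannot help if the previous
        # decision is not visible to the next one.
        with self.lock:
            used = set(self.db_group_hosts)
            if self.record_claims:
                used |= self.pending
            for host in self.hosts:
                if host not in used:
                    if self.record_claims:
                        self.pending.add(host)
                    return host
            return None

    def spawn_finished(self, host):
        # Much later: the compute node reports the instance as spawned.
        with self.lock:
            self.pending.discard(host)
            self.db_group_hosts.add(host)


# Two requests arrive before any spawn has been written back.
naive = ToyScheduler(["A", "B"], record_claims=False)
naive_picks = [naive.schedule(), naive.schedule()]

claiming = ToyScheduler(["A", "B"], record_claims=True)
claimed_picks = [claiming.schedule(), claiming.schedule()]

print(naive_picks, claimed_picks)
```

Without recorded claims both requests pick host A even though the lock is held for each decision; with the claim recorded at decision time the second request moves on to host B.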


Chris




_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
