On 17 March 2014 17:54, John Garbutt <j...@johngarbutt.com> wrote:
> On 15 March 2014 18:39, Chris Friesen <chris.frie...@windriver.com> wrote:
>> Hi,
>>
>> I'm curious why the specified git commit chose to fix the anti-affinity race
>> condition by aborting the boot and triggering a reschedule.
>>
>> It seems to me that it would have been more elegant for the scheduler to do
>> a database transaction that would atomically check that the chosen host was
>> not already part of the group, and then add the instance (with the chosen
>> host) to the group. If the check fails then the scheduler could update the
>> group_hosts list and reschedule. This would prevent the race condition in
>> the first place rather than detecting it later and trying to work around it.
>>
>> This would require setting the "host" field in the instance at the time of
>> scheduling rather than the time of instance creation, but that seems like it
>> should work okay. Maybe I'm missing something though...
>
> We deal with memory races in the same way as this today, when they
> race against the scheduler.
>
> Given the scheduler split, writing that value into the nova db from
> the scheduler would be a step backwards, and it probably breaks lots
> of code that assumes the host is not set until much later.

I forgot to mention: I am starting to be a fan of a two-phase commit
approach, which could deal with these kinds of things in a more explicit
way, before starting the main boot process.

It's not as elegant as a database transaction, but that doesn't seem
possible in the long run. There could well be something I am missing
here too.

John
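
To make the two-phase idea a bit more concrete, here is a rough sketch of
what "reserve first, confirm before the main boot" could look like. This is
an in-memory illustration only, not Nova code: the class AntiAffinityClaims,
its reserve/confirm/release methods, and the place() helper are all
hypothetical names, and a real implementation would need a shared store
rather than a process-local lock.

```python
import threading


class AntiAffinityClaims:
    """Hypothetical in-memory claim registry; not part of Nova."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps (group, host) -> (instance, state), where state is
        # either "reserved" or "confirmed".
        self._claims = {}

    def reserve(self, group, host, instance):
        """Phase 1: claim the host for the group, or fail if it is taken."""
        with self._lock:
            if (group, host) in self._claims:
                return False
            self._claims[(group, host)] = (instance, "reserved")
            return True

    def confirm(self, group, host, instance):
        """Phase 2: mark the claim as confirmed before the boot proceeds."""
        with self._lock:
            owner, _state = self._claims.get((group, host), (None, None))
            if owner != instance:
                raise ValueError("no reservation held by this instance")
            self._claims[(group, host)] = (instance, "confirmed")

    def release(self, group, host, instance):
        """Drop a claim, e.g. when the boot is abandoned."""
        with self._lock:
            owner, _state = self._claims.get((group, host), (None, None))
            if owner == instance:
                del self._claims[(group, host)]


def place(claims, group, candidate_hosts, instance):
    """Try each candidate host in order; return the first one claimed."""
    for host in candidate_hosts:
        if claims.reserve(group, host, instance):
            claims.confirm(group, host, instance)
            return host
    return None  # every candidate already hosts a member of the group
```

The point of the split is that a lost race shows up at reserve() time,
before any boot work has started, so the caller can simply try the next
candidate host instead of aborting a boot that is already in progress.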