GitHub user wido opened a pull request:

    https://github.com/apache/cloudstack/pull/1408

    kvm: Acquire lock when running security group Python script

    When multiple instances start at the same time on a KVM host, the Agent
    can spawn multiple instances of security_group.py, which all try to
    modify iptables/ebtables rules concurrently.
    
    When that happens, one of the two processes fails.
    
    The instance is still started, but it doesn't have any IP connectivity due
    to the failed programming of the security groups.
    
    This modification lets the script acquire an exclusive lock on a file so
    that only one instance of the script talks to iptables/ebtables at once.
    
    Other instances of the script that start in the meantime poll every 100 ms
    to obtain the lock and, if they still cannot obtain it after 10 seconds,
    execute anyway.
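
The locking behaviour described above (exclusive file lock, 100 ms polling,
proceed anyway after 10 seconds) could look roughly like the sketch below. It
is not the code from this pull request: the lock file path, the function names
and the use of fcntl.flock are assumptions made for illustration, and it is
written for Python 3 on Linux.

```python
import fcntl
import os
import tempfile
import time

# Hypothetical lock file path; the path used by the actual patch is not
# stated in this message.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "security_group_example.lock")


def acquire_lock(path=LOCK_PATH, timeout=10.0, interval=0.1):
    """Poll for an exclusive lock on `path`.

    Returns (lock_file, acquired). If the lock is still held by someone
    else once `timeout` seconds have passed, acquired is False and the
    caller may decide to continue without the lock.
    """
    lock_file = open(path, "w")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock raise instead of blocking when the
            # lock is already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return lock_file, True
        except OSError:
            if time.monotonic() >= deadline:
                return lock_file, False
            time.sleep(interval)


def release_lock(lock_file):
    """Drop the lock (a no-op if it was never held) and close the file."""
    fcntl.flock(lock_file, fcntl.LOCK_UN)
    lock_file.close()
```

A caller would wrap the iptables/ebtables work between acquire_lock() and
release_lock(), ideally in a try/finally. Continuing after the timeout trades
strict mutual exclusion for the guarantee that a stuck lock holder cannot
block rule programming indefinitely.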

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wido/cloudstack security-group-lock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1408.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1408
    
----
commit 7e93f33d4a9cf6d2d036fc40c109c9e6e5fafac2
Author: Wido den Hollander <w...@widodh.nl>
Date:   2016-02-09T20:20:58Z

    kvm: Acquire lock when running security group Python script
    
    When multiple instances start at the same time on a KVM host, the Agent
    can spawn multiple instances of security_group.py, which all try to
    modify iptables/ebtables rules concurrently.
    
    When that happens, one of the two processes fails.
    
    The instance is still started, but it doesn't have any IP connectivity due
    to the failed programming of the security groups.
    
    This modification lets the script acquire an exclusive lock on a file so
    that only one instance of the script talks to iptables/ebtables at once.
    
    Other instances of the script that start in the meantime poll every 100 ms
    to obtain the lock and, if they still cannot obtain it after 10 seconds,
    execute anyway.

----

