An idea: make the lock more granular.

Instead of @utils.synchronized('any-name'), I wonder if you could do something 
like:

with utils.synchronized('any-name-$device-id'):
    # Code here

Then at least you won't be locking at the method level (which means no 
concurrency). Would that work?
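
A rough sketch of the idea (illustrative only: the lock-name scheme, the payload 
shape, and the in-process threading locks are assumptions to make the point 
concrete; with multiple API workers you would still want an external lock, like 
the external=True variant Edgar shows below):

import threading
from collections import defaultdict

# Illustrative per-name lock registry: calls for different devices use
# different locks, so only operations on the same device serialize.
_device_locks = defaultdict(threading.Lock)
_registry_lock = threading.Lock()

def _named_lock(name):
    # Guard the registry itself so two threads do not race while
    # creating the same named lock.
    with _registry_lock:
        return _device_locks[name]

# Method on the plugin class; assumes the usual {'port': {...}} payload.
def create_port(self, context, port):
    device_id = port['port'].get('device_id') or 'unassigned'
    with _named_lock('create-port-%s' % device_id):
        # DB transaction + back-end call from Edgar's example go here.
        ...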

From: Edgar Magana <emag...@plumgrid.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Date: Monday, November 18, 2013 12:25 PM
To: OpenStack List <openstack-dev@lists.openstack.org>
Subject: [openstack-dev] [Neutron] Race condition between DB layer and plugin 
back-end implementation

Developers,

This topic has been discussed before but I do not remember if we have a good 
solution or not.
Basically, if concurrent API calls are sent to Neutron, all of them are sent to 
the plug-in level, where two actions have to be taken:

1. DB transaction – Not just for data persistence but also to collect the 
information needed for the next action
2. Plug-in back-end implementation – In our case it is a call to the Python 
library that in turn calls the PLUMgrid REST GW (soon SAL)

For instance:

def create_port(self, context, port):
    with context.session.begin(subtransactions=True):
        # Plugin DB - Port Create and Return port
        port_db = super(NeutronPluginPLUMgridV2, self).create_port(context,
                                                                   port)
        device_id = port_db["device_id"]
        if port_db["device_owner"] == "network:router_gateway":
            router_db = self._get_router(context, device_id)
        else:
            router_db = None
        try:
            LOG.debug(_("PLUMgrid Library: create_port() called"))
            # Back-end implementation
            self._plumlib.create_port(port_db, router_db)
        except Exception:
            …

The way we have implemented this at the plug-in level in Havana (and even in 
Grizzly) is that both actions are wrapped in the same "transaction", which 
automatically rolls back any operation to its original state, mostly protecting 
the DB from ending up in an inconsistent state or with leftover data if the 
back-end part fails.
The problem we are experiencing is that when concurrent calls to the same API 
are sent, the operations at the plug-in back-end take long enough that the next 
concurrent API call gets stuck at the DB transaction level (the transaction 
stays open while the back-end call runs), which leaves the Neutron server in a 
hung state to the point that all concurrent API calls fail.

This can be fixed if we include some "locking" mechanism, such as:

from neutron.common import utils
…

@utils.synchronized('any-name', external=True)
def create_port(self, context, port):
…

Obviously, this will serialize all concurrent calls, which will end up in 
really bad performance. Does anyone have a better solution?

Thanks,

Edgar
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
