Public bug reported: This defect is hard to reproduce, only happens when I have more than 3 compute node with DVR enabled.
With the following script, run several times, I can see one VM in ERROR state. neutron net-create demo-net netdemoid=$(neutron net-list | awk '{if($4=="'demo-net'"){print $2;}}') neutron subnet-create demo-net 10.100.100.0/24 --name demo-subnet subnetdemoid=$(neutron subnet-list | awk '{if($4=="'demo-subnet'"){print $2;}}') neutron router-create demo-router routerdemoid=$(neutron router-list | awk '{if($4=="'demo-router'"){print $2;}}') exnetid=$(neutron net-list | awk '{if($4=="'ext-net'"){print $2;}}') for i in `seq 1 10`; do #boot vm, and create floating ip nova boot --image cirros --flavor m1.tiny --nic net-id=$netdemoid cirrosdemo${i} cirrosdemoid[i]=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $2;}}') output=$(neutron floatingip-create $exnetid) echo $output floatipid[i]=$(echo "$output" | awk '{if($2=="id"){print $4;}}') floatip[i]=$(echo "$output" | awk '{if($2=="floating_ip_address"){print $4;}}')a done # Setup router neutron router-gateway-set $routerdemoid $exnetid neutron router-interface-add demo-router $subnetdemoid #wait for VM to be running sleep 30 for i in `seq 1 10`; do cirrosfix=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $12;}}') cirrosfixip=${cirrosfix#*=} output=$(neutron port-list | grep ${cirrosfixip}) echo $output portid=$(echo "$output" | awk '{print $2;}') neutron floatingip-associate --fixed-ip-address $cirrosfixip ${floatipid[i]} $portid neutron floatingip-delete ${floatipid[i]} nova delete ${cirrosdemoid[i]} done neutron router-interface-delete demo-router $subnetdemoid neutron router-gateway-clear demo-router $netdemoid neutron router-delete demo-router neutron subnet-delete $subnetdemoid neutron net-delete $netdemoid Looking at log file: 2014-11-20 17:25:56.258 31042 DEBUG neutron.openstack.common.lockutils [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] Got semaphore "db-access" lock /opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/openstack/common/lockutils.py:168 2014-11-20 17:25:56.424 31042 ERROR neutron.api.v2.resource [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] delete failed 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Traceback (most recent call last): 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 87, in resource 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource result = method(request=request, **args) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/base.py", line 476, in delete 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource obj_deleter(request.context, id, **kwargs) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 1036, in delete_port 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource router_info = l3plugin.dvr_deletens_if_no_vm(context, id) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/l3_dvrscheduler_db.py", line 195, in dvr_deletens_if_no_vm 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource port_host) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/agents_db.py", line 136, in _get_agent_by_type_and_host 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Agent.host == host).one() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2369, in one 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource ret = list(self) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2411, in _iter_ 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self.session._autoflush() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1198, in _autoflush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self.flush() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1919, in flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self._flush(objects) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource transaction.rollback(_capture_exception=True) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in _exit_ 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource compat.reraise(exc_type, exc_value, exc_tb) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2001, in _flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource flush_context.execute() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource rec.execute(self) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 526, in execute 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource uow 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 60, in save_obj 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource mapper, table, update) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 536, in _emit_update_statements 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource (table.description, len(update), rows)) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource StaleDataError: UPDATE statement on table 'ml2_dvr_port_bindings' expected to update 1 row(s); 0 were matched. 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource 2014-11-20 17:25:56.427 31042 INFO neutron.wsgi [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] 10.23.32.135 - - [20/Nov/2014 17:25:56] "DELETE /v2.0/ports/83f2d987-0ab7-4479-8c97-7ab7b758a2dc.json HTTP/1.1" 500 296 0.235463 The problem could be some race condition. The code l3_devscheduler_db.py >From line 158, got_dvr_routers_by_portid to line 193 >get_agent_by_type_and_host, there are several calls. If the port was there when code on 158. Another thread comes in the middle, delete the port. Line 193 will fail because the entry was not there anymore. Both "neutron floatingip-delete" and "nova delete" can delete the port. I think if we move l3_devscheduler_db.py, line 193: agent = self._get_agent_by_type_and_host(context, q_const.AGENT_TYPE_L3, port_host) to line 164 It should resolve the problem. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1398446 Title: Novacompute failed to detel VM port with DVR Status in OpenStack Neutron (virtual network service): New Bug description: This defect is hard to reproduce, only happens when I have more than 3 compute node with DVR enabled. With the following script, run several times, I can see one VM in ERROR state. neutron net-create demo-net netdemoid=$(neutron net-list | awk '{if($4=="'demo-net'"){print $2;}}') neutron subnet-create demo-net 10.100.100.0/24 --name demo-subnet subnetdemoid=$(neutron subnet-list | awk '{if($4=="'demo-subnet'"){print $2;}}') neutron router-create demo-router routerdemoid=$(neutron router-list | awk '{if($4=="'demo-router'"){print $2;}}') exnetid=$(neutron net-list | awk '{if($4=="'ext-net'"){print $2;}}') for i in `seq 1 10`; do #boot vm, and create floating ip nova boot --image cirros --flavor m1.tiny --nic net-id=$netdemoid cirrosdemo${i} cirrosdemoid[i]=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $2;}}') output=$(neutron floatingip-create $exnetid) echo $output floatipid[i]=$(echo "$output" | awk '{if($2=="id"){print $4;}}') floatip[i]=$(echo "$output" | awk '{if($2=="floating_ip_address"){print $4;}}')a done # Setup router neutron router-gateway-set $routerdemoid $exnetid neutron router-interface-add demo-router $subnetdemoid #wait for VM to be running sleep 30 for i in `seq 1 10`; do cirrosfix=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $12;}}') cirrosfixip=${cirrosfix#*=} output=$(neutron port-list | grep ${cirrosfixip}) echo $output portid=$(echo "$output" | awk '{print $2;}') neutron floatingip-associate --fixed-ip-address $cirrosfixip ${floatipid[i]} $portid neutron floatingip-delete ${floatipid[i]} nova delete ${cirrosdemoid[i]} done neutron router-interface-delete demo-router $subnetdemoid neutron router-gateway-clear demo-router $netdemoid neutron router-delete demo-router neutron subnet-delete $subnetdemoid neutron net-delete $netdemoid Looking at log file: 2014-11-20 17:25:56.258 31042 DEBUG neutron.openstack.common.lockutils [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] Got semaphore "db-access" lock /opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/openstack/common/lockutils.py:168 2014-11-20 17:25:56.424 31042 ERROR neutron.api.v2.resource [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] delete failed 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Traceback (most recent call last): 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 87, in resource 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource result = method(request=request, **args) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/base.py", line 476, in delete 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource obj_deleter(request.context, id, **kwargs) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 1036, in delete_port 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource router_info = l3plugin.dvr_deletens_if_no_vm(context, id) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/l3_dvrscheduler_db.py", line 195, in dvr_deletens_if_no_vm 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource port_host) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/agents_db.py", line 136, in _get_agent_by_type_and_host 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Agent.host == host).one() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2369, in one 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource ret = list(self) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2411, in _iter_ 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self.session._autoflush() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1198, in _autoflush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self.flush() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1919, in flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self._flush(objects) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource transaction.rollback(_capture_exception=True) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in _exit_ 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource compat.reraise(exc_type, exc_value, exc_tb) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2001, in _flush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource flush_context.execute() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource rec.execute(self) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 526, in execute 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource uow 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 60, in save_obj 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource mapper, table, update) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 536, in _emit_update_statements 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource (table.description, len(update), rows)) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource StaleDataError: UPDATE statement on table 'ml2_dvr_port_bindings' expected to update 1 row(s); 0 were matched. 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource 2014-11-20 17:25:56.427 31042 INFO neutron.wsgi [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] 10.23.32.135 - - [20/Nov/2014 17:25:56] "DELETE /v2.0/ports/83f2d987-0ab7-4479-8c97-7ab7b758a2dc.json HTTP/1.1" 500 296 0.235463 The problem could be some race condition. The code l3_devscheduler_db.py From line 158, got_dvr_routers_by_portid to line 193 get_agent_by_type_and_host, there are several calls. If the port was there when code on 158. Another thread comes in the middle, delete the port. Line 193 will fail because the entry was not there anymore. Both "neutron floatingip-delete" and "nova delete" can delete the port. I think if we move l3_devscheduler_db.py, line 193: agent = self._get_agent_by_type_and_host(context, q_const.AGENT_TYPE_L3, port_host) to line 164 It should resolve the problem. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1398446/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp