Reviewed: https://review.opendev.org/c/openstack/neutron/+/939786 Committed: https://opendev.org/openstack/neutron/commit/87bdfd12a8984f54d27f287c547b3822c9fe3d91 Submitter: "Zuul (22348)" Branch: master
commit 87bdfd12a8984f54d27f287c547b3822c9fe3d91 Author: Bence Romsics <bence.roms...@gmail.com> Date: Wed Jan 22 13:03:49 2025 +0100 Do not assume the existence of a trunk bridge since os-vif may have deleted it Since https://review.opendev.org/c/openstack/neutron/+/837780 and https://review.opendev.org/c/openstack/os-vif/+/841499 it is possible that nova/os-vif already deleted the trunk bridge before we get to handle_trunk_remove() -> unwire_subports_for_trunk() -> _update_trunk_metadata(). In the past we assumed that nobody else deleted the trunk bridge before us and because of that when os-vif started deleting the trunk bridge we started producing error messages like: "Failed to store metadata for trunk ...: Cannot find Bridge with name=tbr-..." and "Error executing command (ListPortsCommand): ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-...: With this patch we no longer assume that the trunk bridge exists in _update_trunk_metadata(). Change-Id: I2b9202ddf653c0755c2aacb6b88ce8c983bb29dc Closes-Bug: #2095000 Related-Bug: #1869244 Related-Bug: #1997025 ** Changed in: neutron Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/2095000 Title: ovs-agent: Failed to store metadata for trunk ...: Cannot find Bridge with name=tbr-... Status in neutron: Fix Released Bug description: We have found a recurring error message and traceback in ovs-agent logs, for example: jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command [-] Error executing command (ListPortsCommand): ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1 jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last): jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 47, in execute jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command self.run_idl(None) jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/schema/open_vswitch/commands.py", line 303, in run_idl jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command br = idlutils.row_by_value(self.api.idl, 'Bridge', 'name', self.bridge) jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command raise RowNotFound(table=table, col=column, match=match) jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1 jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command· jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Failed to store metadata for trunk 78e94a46-1fe1-4b81-be34-3112c079a9eb: Cannot find Bridge with name=tbr-78e94a46-1: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1 We can reliably trigger this by: openstack network create tnet0 openstack subnet create --network tnet0 --subnet-range 10.0.100.0/24 tsubnet0 openstack port create --network tnet0 tport0 openstack network trunk create --parent-port tport0 trunk0 openstack server create --flavor cirros256 --image cirros-0.6.3-x86_64-disk --nic port-id=tport0 tvm0 --wait # Deleting the server triggers the error message: openstack server delete tvm0 --wait # Instead of delete, rebuild triggers it just as well: openstack server rebuild --image cirros-0.6.3-x86_64-disk tvm0 --wait We log the same error message from two places: wire_subports_for_trunk() and unwire_subports_for_trunk(). This one of course comes from unwiring: https://opendev.org/openstack/neutron/src/commit/585ea689d5d26356e28d8eb47f6d0511d21806cf/neutron/services/trunk/drivers/openvswitch/agent/ovsdb_handler.py#L321-L322 Clearly it can happen that neutron calls unwire...() when nova/os-vif already deleted the trunk bridge. For example see these logs from during a rebuild: $ sudo journalctl -o short-precise -u devstack@n-cpu.service -u devstack@q-agt.service -S -20m | egrep '(plugged vif|Failed to store metadata for trunk).*tbr-dc729e53-3' jan 14 14:05:38.558024 devstack0 nova-compute[1018]: INFO os_vif [None req-27af63bd-23d9-45d4-aa73-f70e357a38e0 admin admin] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba') jan 14 14:05:48.878403 devstack0 nova-compute[1018]: INFO os_vif [None req-91dd8c76-77d0-4463-9674-324fff56f881 admin admin] Successfully unplugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba') jan 14 14:05:49.285754 devstack0 neutron-openvswitch-agent[108540]: ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] unwire: Failed to store metadata for trunk dc729e53-3902-4ef7-a596-ce358456edc0: Cannot find Bridge with name=tbr-dc729e53-3: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-dc729e53-3 jan 14 14:05:51.197047 devstack0 nova-compute[1018]: INFO os_vif [None req-91dd8c76-77d0-4463-9674-324fff56f881 admin admin] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba') My first guess is that this is an unintended side effect of the last proper fix for bug https://bugs.launchpad.net/neutron/+bug/1869244, that moved trunk bridge deletion from ovs-agent to os-vif. So far I did not find any resources leaked, so we probably only have a spurious error message we could suppress. devstack 2f3440dc neutron 8cca47f2e7 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/2095000/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp