[Expired for neutron because there has been no activity for 60 days.] ** Changed in: neutron Status: Incomplete => Expired
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817548 Title: macvtap: possible race of interfaces scan/deletion at migration Status in neutron: Expired Bug description: A few days ago I stumbled upon a failed migration of VM with macvtap interface. Nova "joyed" me with error "Unsupported VIF type binding_failed". A short investigation lead me to the real problem: macvtap agent crushed 2019-02-21 18:40:38.636 54364 ERROR neutron.agent.linux.utils [req-362a8541-643e-4355-b3f7-a02fb9358b66 - - - - -] Exit code: 1; Stdin: ; Stdout: ; Stderr: Device "macvtap1" does not exist. 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service [req-362a8541-643e-4355-b3f7-a02fb9358b66 - - - - -] Error starting thread.: ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Device "macvtap1" does not exist. 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service Traceback (most recent call last): 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 729, in run_service 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service service.start() 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service result = f(*args, **kwargs) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 111, in start 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service self.daemon_loop() 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service result = f(*args, **kwargs) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 447, in daemon_loop 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service device_info = self.scan_devices(previous=device_info, sync=sync) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service result = f(*args, **kwargs) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 377, in scan_devices 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service current_devices = self.mgr.get_all_devices() 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/macvtap/agent/macvtap_neutron_agent.py", line 138, in get_all_devices 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service mac = ip_lib.get_device_mac(device_name) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 950, in get_device_mac 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service return IPDevice(device_name, namespace=namespace).link.address 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 541, in address 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service return self.attributes.get('link/ether') 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 565, in attributes 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service return self._parse_line(self._run(['o'], ('show', self.name))) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 360, in _run 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service return self._parent._run(options, self.COMMAND, args) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 88, in _run 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service return self._execute(options, command, args) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 102, in _execute 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service log_fail_as_error=self.log_fail_as_error) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 151, in execute 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service raise ProcessExecutionError(msg, returncode=returncode) 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Device "macvtap1" does not exist. 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service 2019-02-21 18:40:38.637 54364 ERROR oslo_service.service 2019-02-21 18:40:38.640 54364 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Stopping Macvtap agent agent. I have digg a bit and it looks like there's a race between interface deletion and a periodic scan in daemon_loop. For now get_all_devices (neutron/plugins/ml2/drivers/macvtap/agent/macvtap_neutron_agent.py +138) does'n check interface existence before get_device_mac: def get_all_devices(self): ............... for device_name in all_device_names: if device_name.startswith(constants.MACVTAP_DEVICE_PREFIX): if ip_lib.device_exists(device_name): mac = ip_lib.get_device_mac(device_name) self.mac_device_name_mappings[mac] = device_name devices.add(mac) It's quite possible that device will be deleted while we're looping over (in my case with ~10% probability). An obvious 'solution' is to check device on every loop but it doesn't really solve an issue though reduce race window so that I able to migrate VM for about 100 times before crush. I'll attache this patch anyway. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817548/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp