Public bug reported:

Description
===========
libvirt CPU power management does not support live migration: migrating an
instance with dedicated CPUs fails when power management is enabled.

Steps to reproduce
==================
1. Turn on libvirt CPU power management ([libvirt]cpu_power_management = True
   in nova.conf) on the compute hosts
2. Boot an instance with hw:cpu_policy=dedicated
3. Live migrate the instance

Expected result
===============
Live migration succeeds.

Actual result
=============
Live migration fails with the following libvirt error in the source nova-compute logs:

[instance: afdd5e62-2a97-4b58-a7e7-bb92152f4165] Migration operation thread notification {{(pid=103809) thread_finished /opt/stack/nova/nova/virt/libvirt/driver.py:10668}}
Feb 21 19:21:15.045216 np0036828692 nova-compute[103809]: Traceback (most recent call last):
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/hubs/hub.py", line 471, in fire_timers
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     timer()
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/hubs/timer.py", line 59, in __call__
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     cb(*args, **kw)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/event.py", line 173, in _do_send
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     waiter.switch(result)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/greenthread.py", line 264, in main
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     result = function(*args, **kwargs)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/nova/nova/utils.py", line 664, in context_wrapper
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     return func(*args, **kwargs)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 10322, in _live_migration_operation
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     with excutils.save_and_reraise_exception():
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     self.force_reraise()
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     raise self.value
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 10311, in _live_migration_operation
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     guest.migrate(self._live_migration_uri(dest),
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 648, in migrate
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     self._domain.migrateToURI3(
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 186, in doit
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     result = proxy_call(self._autowrap, f, *args, **kwargs)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 144, in proxy_call
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     rv = execute(f, *args, **kwargs)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 125, in execute
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     raise e.with_traceback(tb)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 82, in tworker
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     rv = meth(*args, **kwargs)
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:   File "/usr/lib/python3/dist-packages/libvirt.py", line 2126, in migrateToURI3
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]:     raise libvirtError('virDomainMigrateToURI3() failed')
Feb 21 19:21:15.045387 np0036828692 nova-compute[103809]: libvirt.libvirtError: cannot set CPU affinity on process 48279: Invalid argument

Environment
===========
This was originally noticed in a whitebox CI job [1] on devstack master.

Additional info
===============
Regardless of whether NUMA live migration has changed the underlying CPU
pinnings, the cores the instance will be pinned to must be powered up on the
destination host before the guest starts there. Otherwise libvirt attempts to
pin the instance to an offline core, which is consistent with the "cannot set
CPU affinity" error above. Nova currently does not power up those cores during
live migration. With some refactoring of the code, the cores not being powered
on can also be observed in functional tests.
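
To illustrate the missing step, here is a minimal sketch of checking that a set
of pinned cores is online and powering up any that are not. It is not Nova's
implementation. It assumes the standard Linux sysfs CPU hotplug layout
(/sys/devices/system/cpu/cpuN/online), and the function names is_core_online
and ensure_cores_online are hypothetical, chosen for this example only.

```python
import os

# Standard Linux sysfs location for per-CPU hotplug state (assumed layout).
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def _online_path(core, sysfs_root):
    return os.path.join(sysfs_root, f"cpu{core}", "online")


def is_core_online(core, sysfs_root=SYSFS_CPU_ROOT):
    """Return True if the kernel reports the given core as online."""
    path = _online_path(core, sysfs_root)
    if not os.path.isdir(os.path.dirname(path)):
        raise ValueError(f"no such core: {core}")
    if not os.path.exists(path):
        # Cores that cannot be hot-unplugged (commonly cpu0) expose no
        # "online" file and are always online.
        return True
    with open(path) as f:
        return f.read().strip() == "1"


def ensure_cores_online(pinned_cores, sysfs_root=SYSFS_CPU_ROOT):
    """Power up every offline core in pinned_cores; return the ones changed."""
    powered_up = []
    for core in sorted(pinned_cores):
        if not is_core_online(core, sysfs_root):
            # Writing "1" asks the kernel to bring the core online.
            # On a real host this requires root privileges.
            with open(_online_path(core, sysfs_root), "w") as f:
                f.write("1")
            powered_up.append(core)
    return powered_up
```

In this sketch the destination host would call ensure_cores_online() with the
instance's destination pinning before libvirt starts the incoming guest. The
sysfs_root parameter exists only so the logic can be exercised against a fake
directory tree, for example in a functional test.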

[1]
https://zuul.opendev.org/t/openstack/build/532b30767df54147a01508e7616930f5/logs

** Affects: nova
     Importance: Critical
         Status: In Progress

** Changed in: nova
   Importance: Undecided => Critical

https://bugs.launchpad.net/bugs/2056613

Title:
   libvirt CPU power management does not support live migration

Status in OpenStack Compute (nova):
  In Progress
