Public bug reported:

Description
-----------
When the network connection to rbd (RADOS Block Device) storage is lost due to a failure, `get_power_state` blocks when it tries to query the power state of a virtual machine. The goal is to check the power status and migrate the VMs that are still online. However, when the periodic monitoring program `domstats` hangs while accessing the disconnected storage, it keeps libvirt's rpc-worker occupied for extended periods. On hosts with multiple virtual machines, calls to the power status interface are delayed as well and cannot be executed immediately.

Steps to reproduce
------------------
1. Disconnect the network for the rbd storage.
2. Schedule `domstats` to run every 10 seconds.

Expected result
---------------
The expected outcome is to switch to a higher-priority interface within libvirt, such as `domain.state()`, possibly in conjunction with a priority RPC mechanism like `prio-rpc`. This would ensure that critical operations, including querying power states and conducting necessary migrations, are prioritized and can still be executed promptly even under resource-constrained conditions.

** Affects: nova
     Importance: Undecided
     Assignee: Yalei Li (chetaiyong)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Yalei Li (chetaiyong)

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2048848

Title:
  get_power_state blocked

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2048848/+subscriptions
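
The expected result in this report (querying power state through `domain.state()` and not letting a stuck call hold up the caller) could be sketched roughly as follows. This is only an illustration under stated assumptions, not nova's implementation: `query_power_state`, `LIBVIRT_STATE_NAMES` and `_POOL` are hypothetical names, the thread pool and the timeout are stand-ins for whatever concurrency mechanism the caller really uses, and the sketch takes the report's word that `domain.state()` is served on a higher-priority path. The only libvirt behaviour relied on is that the Python binding's `state()` returns a `[state, reason]` pair and that the numeric states follow the `virDomainState` enum.

```python
import concurrent.futures

# Numeric values of libvirt's virDomainState enum (VIR_DOMAIN_*).
LIBVIRT_STATE_NAMES = {
    0: "NOSTATE",
    1: "RUNNING",
    2: "BLOCKED",
    3: "PAUSED",
    4: "SHUTDOWN",
    5: "SHUTOFF",
    6: "CRASHED",
    7: "PMSUSPENDED",
}

# Hypothetical shared pool (an assumption of this sketch). A call that is
# stuck inside libvirt keeps its worker thread busy until it returns; only
# the caller is released when the timeout expires.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def query_power_state(domain, timeout=5.0):
    """Return the power state name of `domain`, or "UNKNOWN" on timeout.

    `domain` is any object whose state() method returns [state, reason],
    which is what a libvirt-python virDomain object returns.
    """
    future = _POOL.submit(domain.state)
    try:
        state, _reason = future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # The query did not answer in time; report that instead of blocking.
        return "UNKNOWN"
    return LIBVIRT_STATE_NAMES.get(state, "UNKNOWN")
```

With libvirt-python, a domain object obtained from `conn.lookupByName(name)` could be passed in directly. Note that the timeout only protects the caller; it does not free the libvirt worker that is serving a hung request, which is why the report also asks for a priority RPC path.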