Public bug reported:

Because of the Spectre/Meltdown vulnerabilities (2018), we needed to disable SMT on all public-facing compute nodes. As a result, the number of available cores was reduced by half.
We had flavors available with 32 vCPUs that couldn't be used anymore because the placement max_unit for vCPUs is hardcoded to the total number of CPUs, regardless of the allocation_ratio. To me that is a sensible default, but it doesn't offer any flexibility for operators.

See the IRC discussion from that time:
http://eavesdrop.openstack.org/irclogs/%23openstack-placement/%23openstack-placement.2018-09-20.log.html

In conclusion, we informed the users that we couldn't offer those flavors anymore. The old VMs (created before disabling SMT) continued to run without any issue.

So... after ~2 years I'm hitting this problem again :)
These compute nodes now need to be retired and we are live migrating all the instances to the replacement hardware. When trying to live migrate these instances (vCPUs > max_unit), the migration fails because the migration allocation can't be created against the source compute node. For the new hardware (dest_compute), vCPUs < max_unit, so there is no issue with the new allocation.

I'm working around this problem (to live migrate the instances) by patching the code to set a higher max_unit for vCPUs on the compute nodes hosting these instances.

I feel that this issue should be discussed again, and that we should consider making the max_unit value configurable.

** Affects: nova
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1918419

Title:
  vCPU resource max_unit is hardcoded

Status in OpenStack Compute (nova):
  New
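
For anyone not familiar with how an allocation is checked against an inventory, here is a rough sketch of the failure described in this report. It is a simplified model, not the actual placement code: the Inventory class, the can_allocate helper, and all the numbers (16 cores on the source node, 64 on the destination, an allocation_ratio of 4.0) are assumptions made up for illustration. It only shows that a request larger than max_unit is rejected even when the capacity derived from allocation_ratio would be large enough.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Inventory:
    """Simplified, hypothetical stand-in for a VCPU inventory record."""
    total: int
    reserved: int
    min_unit: int
    max_unit: int
    step_size: int
    allocation_ratio: float

    @property
    def capacity(self) -> int:
        # Overcommit applies to capacity only, not to max_unit.
        return int((self.total - self.reserved) * self.allocation_ratio)


def can_allocate(inv: Inventory, used: int, amount: int) -> Tuple[bool, str]:
    """Return (accepted, reason) for a single allocation request."""
    if amount < inv.min_unit or amount > inv.max_unit:
        return False, "amount outside [min_unit, max_unit]"
    if amount % inv.step_size != 0:
        return False, "amount is not a multiple of step_size"
    if used + amount > inv.capacity:
        return False, "amount exceeds remaining capacity"
    return True, "ok"


# Hypothetical source node after disabling SMT: 16 cores, max_unit == total.
source = Inventory(total=16, reserved=0, min_unit=1, max_unit=16,
                   step_size=1, allocation_ratio=4.0)
# Hypothetical replacement hardware with more cores.
dest = Inventory(total=64, reserved=0, min_unit=1, max_unit=64,
                 step_size=1, allocation_ratio=4.0)

# A 32 vCPU instance: rejected on the source, accepted on the destination.
source_result = can_allocate(source, used=0, amount=32)
dest_result = can_allocate(dest, used=0, amount=32)
print(source.capacity, source_result)
print(dest.capacity, dest_result)
```

In this model the source node has capacity for 64 vCPUs, yet the 32 vCPU request is refused purely because of max_unit, which matches what happens to the migration allocation on the source compute node.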

