Reviewed:  https://review.opendev.org/c/openstack/nova/+/926521
Committed: https://opendev.org/openstack/nova/commit/ab18f3763c096d1f4c0da6ad825d670dd5a06b94
Submitter: "Zuul (22348)"
Branch:    master
commit ab18f3763c096d1f4c0da6ad825d670dd5a06b94
Author: Amit Uniyal <auni...@redhat.com>
Date:   Mon Aug 19 07:42:43 2024 +0000

    Libvirt: updates resource provider trait list

    This change updates resource provider trait list for hw architecture and
    hw emulation architecture

    Closes-Bug: #2062425
    Change-Id: Ia571c5e5e881162d331b638ae2d4a332807d17f5

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2062425

Title:
  Nova/Placement creating x86 trait for ARM Compute node

Status in OpenStack Compute (nova):
  Fix Released

Bug description:

Description
===========
I have a 2023.2 based deployment with both x86 and aarch64 based compute
nodes. For the arm node, placement shows an x86 HW trait and no aarch64
architecture trait, so scheduling arm architecture images onto it fails.
It also causes the scheduler to try to place x86 images on this node,
which will fail.

Steps to reproduce
==================
1. Deploy a new 2023.2 deployment with Kolla-ansible.
2. Add hw_architecture=aarch64 to a valid glance image.
3. Ensure that image_metadata_prefilter = True in nova.conf on all nova services.
4. Try to deploy an instance with that image; it will fail with no valid host found.
5.
Observe the following in the placement-api logs:

placement-api.log:41054:2024-04-18 20:39:04.271 21 DEBUG placement.requestlog [req-0114c318-5dfd-4588-807b-e591a82ce098 req-bd588ea0-5700-4b8e-a43f-0eb15a7275e8 - - - - - -] Starting request: 10.27.10.33 "GET /allocation_candidates?limit=1000&member_of=in%3Aceceb7fb-e0ed-4304-a69f-b327da7ca63f&resources=DISK_GB%3A60%2CMEMORY_MB%3A8192%2CVCPU%3A4&root_required=HW_ARCH_AARCH64%2C%21COMPUTE_STATUS_DISABLED" __call__ /var/lib/kolla/venv/lib/python3.10/site-packages/placement/requestlog.py:55
placement-api.log:41055:2024-04-18 20:39:04.317 21 DEBUG placement.objects.research_context [req-0114c318-5dfd-4588-807b-e591a82ce098 req-bd588ea0-5700-4b8e-a43f-0eb15a7275e8 8ce24731fb34492c9354f05050216395 c48da85ca48f4296b59bacb7b3c2fdfd - - default default] found no providers satisfying required traits: {'HW_ARCH_AARCH64'} and forbidden traits: {'COMPUTE_STATUS_DISABLED'} _process_anchor_traits /var/lib/kolla/venv/lib/python3.10/site-packages/placement/objects/research_context.py:243

Resource providers:

openstack resource provider list
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+
| uuid                                 | name                      | generation | root_provider_uuid                   | parent_provider_uuid |
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+
| a6aa43fb-c819-4dae-b172-b5ed76901591 | infra-prod-compute-04     | 7          | a6aa43fb-c819-4dae-b172-b5ed76901591 | None                 |
| 2a019b35-25ac-4085-a13d-07802bda6828 | infra-prod-compute-03     | 10         | 2a019b35-25ac-4085-a13d-07802bda6828 | None                 |
| a008c58b-d16c-4b80-8f58-ca96d1fce2a3 | infra-prod-compute-05     | 7          | a008c58b-d16c-4b80-8f58-ca96d1fce2a3 | None                 |
| e97340aa-5848-4939-a409-701e5ad52396 | infra-prod-compute-02     | 31         | e97340aa-5848-4939-a409-701e5ad52396 | None                 |
| 9345e4d0-fc49-4e51-9f38-faeabec1b053 | infra-prod-compute-01     | 18         | 9345e4d0-fc49-4e51-9f38-faeabec1b053 | None                 |
| 41611dae-3006-4449-9c8b-3369d9b0feb8 | infra-prod-compile-01     | 5          | 41611dae-3006-4449-9c8b-3369d9b0feb8 | None                 |
| 7fecff4c-9e2d-4d89-a345-91ab4d8c1857 | infra-prod-compile-02     | 5          | 7fecff4c-9e2d-4d89-a345-91ab4d8c1857 | None                 |
| fbd4030a-1cc9-455a-bca2-2b606fcb3c4d | infra-prod-compile-03     | 5          | fbd4030a-1cc9-455a-bca2-2b606fcb3c4d | None                 |
| 4d3b29fd-0048-4768-93fa-b7a98f81c125 | infra-prod-compute-06     | 9          | 4d3b29fd-0048-4768-93fa-b7a98f81c125 | None                 |
| f888bda6-8fb7-4f84-8b87-c9af3b36a6ae | infra-prod-compute-07     | 7          | f888bda6-8fb7-4f84-8b87-c9af3b36a6ae | None                 |
| 4f53c8d0-bf1d-44d3-89d5-b8f5436ee66a | infra-prod-compile-04     | 5          | 4f53c8d0-bf1d-44d3-89d5-b8f5436ee66a | None                 |
| 7b6a42c8-b9b4-44a6-9111-2f732c7074e1 | infra-prod-compile-05     | 5          | 7b6a42c8-b9b4-44a6-9111-2f732c7074e1 | None                 |
| 8312a824-8d88-4646-9eb5-c4937329dab9 | infra-prod-compute-08     | 4          | 8312a824-8d88-4646-9eb5-c4937329dab9 | None                 |
| 9e60caa5-28ed-4719-aaf5-690b111f17fd | infra-prod-compute-09     | 4          | 9e60caa5-28ed-4719-aaf5-690b111f17fd | None                 |
| cbfef7fd-b910-4d77-b448-70cdb9638967 | infra-prod-compute-10     | 4          | cbfef7fd-b910-4d77-b448-70cdb9638967 | None                 |
| d7efda90-b91c-419f-b0be-0f339f37653a | infra-prod-compute-11     | 4          | d7efda90-b91c-419f-b0be-0f339f37653a | None                 |
| 067f20f4-f513-465e-9e32-e505a97ab165 | infra-prod-compute-12     | 4          | 067f20f4-f513-465e-9e32-e505a97ab165 | None                 |
| 57a098bf-31d4-4e4f-9a28-72a925d2384c | infra-prod-arm-compute-01 | 12         | 57a098bf-31d4-4e4f-9a28-72a925d2384c | None                 |
| 632c23d6-63df-4143-9d4c-deb2bdc94c80 | infra-prod-compute-13     | 4          | 632c23d6-63df-4143-9d4c-deb2bdc94c80 | None                 |
| 0fe3d535-8aec-4307-943e-2c46b01bc019 | infra-prod-compute-14     | 4          | 0fe3d535-8aec-4307-943e-2c46b01bc019 | None                 |
| 8f60a0e9-2510-48ce-b305-6937314bac4a | infra-prod-compute-15     | 4          | 8f60a0e9-2510-48ce-b305-6937314bac4a | None                 |
+--------------------------------------+---------------------------+------------+--------------------------------------+----------------------+

Traits showing for the arm node (notice no HW_ARCH_AARCH64):

openstack resource provider trait list 57a098bf-31d4-4e4f-9a28-72a925d2384c
+---------------------------------------+
| name                                  |
+---------------------------------------+
| COMPUTE_IMAGE_TYPE_QCOW2              |
| COMPUTE_ADDRESS_SPACE_EMULATED        |
| COMPUTE_NET_VIF_MODEL_VMXNET3         |
| COMPUTE_GRAPHICS_MODEL_NONE           |
| COMPUTE_IMAGE_TYPE_ISO                |
| COMPUTE_DEVICE_TAGGING                |
| COMPUTE_NET_VIF_MODEL_NE2K_PCI        |
| COMPUTE_GRAPHICS_MODEL_VIRTIO         |
| COMPUTE_RESCUE_BFV                    |
| COMPUTE_STORAGE_BUS_VIRTIO            |
| COMPUTE_STORAGE_BUS_SCSI              |
| COMPUTE_GRAPHICS_MODEL_VGA            |
| COMPUTE_IMAGE_TYPE_AMI                |
| COMPUTE_NET_VIF_MODEL_E1000           |
| COMPUTE_STORAGE_BUS_SATA              |
| COMPUTE_NET_VIF_MODEL_PCNET           |
| COMPUTE_NET_ATTACH_INTERFACE          |
| HW_CPU_X86_AESNI                      |
| COMPUTE_STORAGE_BUS_USB               |
| COMPUTE_ADDRESS_SPACE_PASSTHROUGH     |
| COMPUTE_NET_VIF_MODEL_RTL8139         |
| COMPUTE_NET_ATTACH_INTERFACE_WITH_TAG |
| COMPUTE_VOLUME_ATTACH_WITH_TAG        |
| COMPUTE_TRUSTED_CERTS                 |
| COMPUTE_IMAGE_TYPE_AKI                |
| COMPUTE_VIOMMU_MODEL_SMMUV3           |
| COMPUTE_STORAGE_BUS_FDC               |
| COMPUTE_VIOMMU_MODEL_AUTO             |
| COMPUTE_VOLUME_EXTEND                 |
| COMPUTE_SOCKET_PCI_NUMA_AFFINITY      |
| COMPUTE_NET_VIF_MODEL_E1000E          |
| COMPUTE_NODE                          |
| COMPUTE_ACCELERATORS                  |
| COMPUTE_IMAGE_TYPE_RAW                |
| COMPUTE_VOLUME_MULTI_ATTACH           |
| COMPUTE_IMAGE_TYPE_ARI                |
| COMPUTE_GRAPHICS_MODEL_BOCHS          |
| COMPUTE_NET_VIF_MODEL_SPAPR_VLAN      |
| COMPUTE_GRAPHICS_MODEL_CIRRUS         |
| COMPUTE_GRAPHICS_MODEL_VMVGA          |
| COMPUTE_NET_VIF_MODEL_VIRTIO          |
| COMPUTE_VIOMMU_MODEL_VIRTIO           |
+---------------------------------------+

Confirmation that it is an arm based system:

root@infra-prod-arm-compute-01:/etc/kolla/nova-libvirt# uname -a
Linux infra-prod-arm-compute-01 5.15.0-102-generic #112-Ubuntu SMP Tue Mar 5 16:49:56 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

On the startup of the nova-compute instance on this
compute, I can see the libvirt output shows as much:

2024-04-18 21:47:43.978 7 INFO nova.service [-] Starting compute node (version 28.0.2)
2024-04-18 21:47:44.000 7 INFO nova.virt.node [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Determined node identity 57a098bf-31d4-4e4f-9a28-72a925d2384c from /var/lib/nova/compute_id
2024-04-18 21:47:44.021 7 INFO nova.virt.libvirt.driver [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Connection event '1' reason 'None'
2024-04-18 21:47:44.460 7 INFO nova.virt.libvirt.host [None req-58e563b8-cf35-4973-be78-d43cab808258 - - - - - -] Libvirt host capabilities <capabilities>

  <host>
    <uuid>38393550-3736-4753-4833-3334564b5842</uuid>
    <cpu>
      <arch>aarch64</arch>
      <model>Neoverse-N1</model>
      <vendor>ARM</vendor>
      <topology sockets='1' dies='1' cores='128' threads='1'/>

nova.conf for the nova-compute service on that node:

[DEFAULT]
debug = False
log_dir = /var/log/kolla/nova
state_path = /var/lib/nova
allow_resize_to_same_host = true
compute_driver = libvirt.LibvirtDriver
my_ip = <ip>
transport_url = rabbit://<url>
default_schedule_zone = nova

[conductor]
workers = 5

[vnc]
novncproxy_host = <ip>
novncproxy_port = 6080
server_listen = <ip>
server_proxyclient_address = <ip>
novncproxy_base_url = https://example.com:6080/vnc_lite.html

[serial_console]
enabled = true
base_url = wss://example.com:6083/
serialproxy_host = <ip>
serialproxy_port = 6083
proxyclient_address = <ip>

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[glance]
debug = False
api_servers = http://<ip>:9292
cafile =
num_retries = 3

[cinder]
catalog_info = volumev3:cinderv3:internalURL
os_region_name = RegionOne
auth_url = http://<ip>:5000
auth_type = password
project_domain_name = Default
user_domain_id = default
project_name = service
username = cinder
password = <pw>
cafile =

[neutron]
metadata_proxy_shared_secret = <secret>
service_metadata_proxy = true
auth_url = http://<ip>:5000
auth_type = password
cafile =
project_domain_name = Default
user_domain_id = default
project_name = service
username = neutron
password = <pw>
region_name = Westford
valid_interfaces = RegionOne

[libvirt]
connection_uri = qemu+tcp://<ip>/system
live_migration_inbound_addr = <ip>
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
disk_cachemodes = network=writeback
hw_disk_discard = unmap
rbd_secret_uuid = 48d56060-bcf0-4f94-bee8-83ab18eaabbd
virt_type = kvm
cpu_mode = host-passthrough
num_pcie_ports = 16

[workarounds]
skip_cpu_compare_on_dest = True

[upgrade_levels]
compute = auto

[oslo_messaging_notifications]
transport_url = rabbit://<url>
driver = messagingv2
topics = notifications_designate

[oslo_messaging_rabbit]
heartbeat_in_pthread = false
amqp_durable_queues = true

[privsep_entrypoint]
helper_command = sudo nova-rootwrap /etc/nova/rootwrap.conf privsep-helper --config-file /etc/nova/nova.conf

[guestfs]
debug = False

[placement]
auth_type = password
auth_url = http://<ip>:5000
username = placement
password = <pw>
user_domain_name = Default
project_name = service
project_domain_name = Default
region_name = RegionOne
cafile =
valid_interfaces = internal

[notifications]
notify_on_state_change = vm_and_task_state

[barbican]
auth_endpoint = http://<ip>:5000
barbican_endpoint_type = internal
verify_ssl_path =

[service_user]
send_service_user_token = true
auth_url = http://<ip>:5000
auth_type = password
project_domain_id = default
user_domain_id = default
project_name = service
username = nova
password = <pw>
cafile =
region_name = RegionOne
valid_interfaces = internal

[scheduler]
image_metadata_prefilter = True

I have tried to run

openstack resource provider trait delete 57a098bf-31d4-4e4f-9a28-72a925d2384c

to delete all traits, then restarted the nova_compute on this compute node, however the same traits come back.
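For illustration only, the mismatch reported above can be reproduced offline from the logged request. The sketch below decodes the root_required parameter from the placement-api log line and checks it against a subset of the traits listed for the arm provider. It is a minimal sketch under stated assumptions, not placement's actual matching code: the helper names split_root_required and provider_matches are hypothetical, and the matching rule (every required trait present, no forbidden trait present) is a simplification.

```python
from urllib.parse import parse_qs, urlsplit

# Request path copied from the placement-api log line in the bug description.
REQUEST = (
    "/allocation_candidates?limit=1000"
    "&member_of=in%3Aceceb7fb-e0ed-4304-a69f-b327da7ca63f"
    "&resources=DISK_GB%3A60%2CMEMORY_MB%3A8192%2CVCPU%3A4"
    "&root_required=HW_ARCH_AARCH64%2C%21COMPUTE_STATUS_DISABLED"
)

# Subset of the traits reported for provider 57a098bf-... in the bug
# description (the full list is longer; none of them is HW_ARCH_AARCH64).
ARM_NODE_TRAITS = {
    "COMPUTE_NODE",
    "COMPUTE_IMAGE_TYPE_QCOW2",
    "COMPUTE_VIOMMU_MODEL_SMMUV3",
    "HW_CPU_X86_AESNI",
}


def split_root_required(request):
    """Hypothetical helper: return (required, forbidden) trait sets.

    parse_qs percent-decodes the value, so "%2C" becomes "," and "%21"
    becomes "!", the prefix that marks a forbidden trait.
    """
    query = parse_qs(urlsplit(request).query)
    required, forbidden = set(), set()
    for value in query.get("root_required", []):
        for trait in value.split(","):
            if trait.startswith("!"):
                forbidden.add(trait[1:])
            else:
                required.add(trait)
    return required, forbidden


def provider_matches(traits, required, forbidden):
    """Simplified rule: all required traits present, no forbidden ones."""
    return required <= traits and not (forbidden & traits)


required, forbidden = split_root_required(REQUEST)
print("required: ", sorted(required))
print("forbidden:", sorted(forbidden))
print("arm node matches:", provider_matches(ARM_NODE_TRAITS, required, forbidden))
```

With the trait set as reported, the arm node does not match because HW_ARCH_AARCH64 is missing, which is consistent with the "found no providers satisfying required traits" log entry. Adding that trait to the set makes the same check pass.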