OK, so reproducing this locally using https://review.opendev.org/c/openstack/oslo.utils/+/937037
the error is:

[18:59:29]➜ python3 -m oslo_utils.imageutils.format_inspector ../flatcar-stable-4081.2.0-kube-v1.30.1.img.raw
inspecting file: ../flatcar-stable-4081.2.0-kube-v1.30.1.img.raw
detected file format: gpt
running safety checks...
Safety check mbr on gpt failed because GPT MBR defines invalid extra partitions
FAILED!
Safety checks failed: mbr 1/1 failed

and according to
https://wiki.osdev.org/GPT#LBA_0\:_Protective_Master_Boot_Record

"""
The UEFI specification requires that the PMBR partition table contain one partition record, with the other three partition records set to zero.
"""

So I need to look at the detection code in oslo.utils to confirm, but I'm 99% sure the flatcar image does not contain a valid PMBR record based on the UEFI spec requirements. As such, nova is correctly rejecting the image. It may have been a working image, but it does not look like it's a valid one.

I'm going to mark this as invalid for nova and add oslo.utils to the bug, given this is shared code in the imageutils.

I see a few paths forward:

one, close this as invalid, and flatcar can make their images conform to the UEFI spec.
two, add a compatibility flag that relaxes this constraint if opted into.
three, relax it unconditionally.

The concern with 2 and 3 is that if the OVMF firmware in QEMU or on real hardware ever enforces the requirement, it will break in the future. Option 1 means existing "working" but potentially invalid images will not work on OpenStack.

There are a few things we need to confirm.

First, does the flatcar image have multiple partitions in the PMBR? As we can see from rocky 8

[18:59:40]❯ python3 -m oslo_utils.imageutils.format_inspector ../Rocky-8-GenericCloud-Base.latest.x86_64.raw
inspecting file: ../Rocky-8-GenericCloud-Base.latest.x86_64.raw
detected file format: gpt
running safety checks...
PASSED!

having multiple partitions is OK; it's listing more than one in the first sector, the Protective Master Boot Record, that is invalid.
Second, we need to see if we can find a direct reference to the UEFI requirement.

Third, we need to discuss with oslo and the other stakeholders whether a compat mode is a valid approach, or whether we really want to require strict conformance.

** Also affects: oslo.utils
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2091114

Title:
  Nova validation checks checks reject valid UEFI image

Status in OpenStack Compute (nova):
  Invalid
Status in oslo.utils:
  New

Bug description:
  This relates specifically to this image:
  https://storage.googleapis.com/artifacts.k8s-staging-capi-openstack.appspot.com/test/flatcar/flatcar-stable-4081.2.0-kube-v1.30.1.img

  However, the problem should be easy enough to understand just from the description here without downloading it.

  When attempting to boot the image in 2024.2 devstack we see the following failure:

  Dec 04 11:01:53 capo-e2e-controller.c.k8s-infra-e2e-boskos-107.internal nova-compute[114399]: WARNING oslo_utils.imageutils.format_inspector [None req-993c42cb-8da1-4cc5-83fd-1c16c08cbc13 demo demo] Safety check mbr on gpt failed because GPT MBR defines invalid extra partitions: oslo_utils.imageutils.format_inspector.SafetyViolation: GPT MBR defines invalid extra partitions

  There is an associated stack trace and the server enters the ERROR state.

  This is a QCOW2 image.
  After downloading it I can manually convert it to raw to inspect its partition table:

  > qemu-img convert -f qcow2 flatcar-stable-4081.2.0-kube-v1.30.1.img -O raw flatcar-stable-4081.2.0-kube-v1.30.1.img.raw
  > fdisk -l flatcar-stable-4081.2.0-kube-v1.30.1.img.raw
  Disk flatcar-stable-4081.2.0-kube-v1.30.1.img.raw: 20 GiB, 21474836480 bytes, 41943040 sectors
  Units: sectors of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disklabel type: gpt
  Disk identifier: D814FAF6-AD0A-4FC1-8DE9-236755D902E5

  Device                                          Start      End  Sectors  Size Type
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw1    4096   266239   262144  128M EFI System
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw2  266240   270335     4096    2M BIOS boot
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw3  270336  2367487  2097152    1G unknown
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw4 2367488  4464639  2097152    1G unknown
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw6 4464640  4726783   262144  128M Linux filesystem
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw7 4726784  4857855   131072   64M unknown
  flatcar-stable-4081.2.0-kube-v1.30.1.img.raw9 4857856 41943006 37085151 17.7G unknown

  We apparently have Nova configured to convert qcow2 images to raw before booting them, which we can also see in the logs:

  Dec 04 11:01:37 capo-e2e-controller.c.k8s-infra-e2e-boskos-107.internal nova-compute[114399]: DEBUG nova.virt.images [None req-993c42cb-8da1-4cc5-83fd-1c16c08cbc13 demo demo] 945136cb-6cc4-4e09-a785-50eaa79e2b10 was qcow2, converting to raw {{(pid=114399) fetch_to_raw /opt/stack/nova/nova/virt/images.py:254}}

  Using a patch to oslo.utils from Stephen Finucane and adding some extra print statements of my own, it's clear that we're failing here:

  https://github.com/openstack/oslo.utils/blob/79f5ec658e2fee8ab46201a71faaff8d3b67a690/oslo_utils/imageutils/format_inspector.py#L1273-L1274

  > ./venv/bin/python ./oslo_utils/imageutils/format_inspector.py /tmp/flatcar-stable-4081.2.0-kube-v1.30.1.img.raw
  inspecting file: /tmp/flatcar-stable-4081.2.0-kube-v1.30.1.raw
  detected file format: gpt
  running safety checks...
  i: 0, ostype: 12
  i: 1, ostype: 238
  i: 2, ostype: 0
  i: 3, ostype: 0
  valid_partions: [0, 1]
  Safety check mbr on gpt failed because GPT MBR defines invalid extra partitions
  FAILED!
  Safety checks failed: mbr 1/1 failed

  This code expects there to be exactly one partition with a non-zero partition type, and that this partition must be the first one. In this image, both of the first 2 partitions have a non-zero partition type.
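
For anyone who wants to check an image by hand, the rule described in the bug description (exactly one MBR partition record with a non-zero type, and it must be the first one) can be sketched as below. This is only an illustration, not the oslo.utils implementation: the function names are made up for this example, and it assumes the classic MBR layout (four 16-byte partition records starting at byte 446 of the first sector, type byte at offset 4 of each record, 0x55 0xAA signature in the last two bytes). In the debug output above, ostype 238 is 0xEE (the GPT protective type) and ostype 12 is 0x0C (FAT32 with LBA).

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # start of the four partition records
PARTITION_ENTRY_SIZE = 16


def mbr_partition_types(sector):
    """Return the type byte of each of the four MBR partition records.

    Hypothetical helper for illustration; `sector` is the first 512 bytes
    of a raw disk image.
    """
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least the first 512 bytes of the disk")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR boot signature")
    types = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + i * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        # Record layout: status(1) CHS start(3) type(1) CHS end(3)
        # starting LBA(4) sector count(4), little-endian.
        _status, _chs_s, os_type, _chs_e, _lba, _count = struct.unpack(
            "<B3sB3sII", entry)
        types.append(os_type)
    return types


def only_first_record_used(types):
    """True if record 0 has a non-zero type and records 1-3 are all zero.

    This mirrors the rule as described in the bug report, nothing more.
    """
    return types[0] != 0 and all(t == 0 for t in types[1:])


def make_sector(types):
    """Build a synthetic first sector with the given four type bytes."""
    sector = bytearray(SECTOR_SIZE)
    for i, os_type in enumerate(types):
        sector[PARTITION_TABLE_OFFSET + i * PARTITION_ENTRY_SIZE + 4] = os_type
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    # Type bytes taken from the debug output above (12 and 238).
    flatcar_like = mbr_partition_types(make_sector([12, 238, 0, 0]))
    print(flatcar_like, only_first_record_used(flatcar_like))
```

To look at a real image, pass the first 512 bytes of the raw file, for example open(path, "rb").read(512), to mbr_partition_types() instead of the synthetic sector.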