On 1/31/25 18:15, Cédric Le Goater wrote:
> On 1/30/25 21:41, Alex Williamson wrote:
>> On Fri, 31 Jan 2025 02:33:03 +0800
>> Tomita Moeko <tomitamo...@gmail.com> wrote:
>>
>>> On 1/25/25 15:42, Tomita Moeko wrote:
>>>> On 1/25/25 05:13, Alex Williamson wrote:
>>>>> On Sat, 25 Jan 2025 03:12:45 +0800
>>>>> Tomita Moeko <tomitamo...@gmail.com> wrote:
>>>>>
>>>>>> Both enable opregion option (x-igd-opregion) and legacy mode require
>>>>>> setting up OpRegion copy for IGD devices. Move x-igd-opregion handler
>>>>>> in vfio_realize() to vfio_probe_igd_config_quirk() to elimate duplicate
>>>>>> code. Finally we moved all the IGD-related code into igd.c.
>>>>>>
>>>>>> Signed-off-by: Tomita Moeko <tomitamo...@gmail.com>
>>>>>> ---
>>>>>>  hw/vfio/igd.c        | 143 +++++++++++++++++++++++++++++++++----------
>>>>>>  hw/vfio/pci-quirks.c |  50 ---------------
>>>>>>  hw/vfio/pci.c        |  25 --------
>>>>>>  hw/vfio/pci.h        |   4 --
>>>>>>  4 files changed, 110 insertions(+), 112 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>>>>>> index 6e06dd774a..015beacf5f 100644
>>>>>> --- a/hw/vfio/igd.c
>>>>>> +++ b/hw/vfio/igd.c
>>>>>> @@ -106,6 +106,7 @@ static int igd_gen(VFIOPCIDevice *vdev)
>>>>>>      return -1;
>>>>>>  }
>>>>>> +#define IGD_ASLS 0xfc /* ASL Storage Register */
>>>>>>  #define IGD_GMCH 0x50 /* Graphics Control Register */
>>>>>>  #define IGD_BDSM 0x5c /* Base Data of Stolen Memory */
>>>>>>  #define IGD_BDSM_GEN11 0xc0 /* Base Data of Stolen Memory of gen 11 and later */
>>>>>> @@ -138,6 +139,55 @@ static uint64_t igd_stolen_memory_size(int gen, uint32_t gmch)
>>>>>>      return 0;
>>>>>>  }
>>>>>> +/*
>>>>>> + * The OpRegion includes the Video BIOS Table, which seems important for
>>>>>> + * telling the driver what sort of outputs it has. Without this, the device
>>>>>> + * may work in the guest, but we may not get output. This also requires BIOS
>>>>>> + * support to reserve and populate a section of guest memory sufficient for
>>>>>> + * the table and to write the base address of that memory to the ASLS register
>>>>>> + * of the IGD device.
>>>>>> + */
>>>>>> +static bool vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,
>>>>>> +                                       struct vfio_region_info *info,
>>>>>> +                                       Error **errp)
>>>>>> +{
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    vdev->igd_opregion = g_malloc0(info->size);
>>>>>> +    ret = pread(vdev->vbasedev.fd, vdev->igd_opregion,
>>>>>> +                info->size, info->offset);
>>>>>> +    if (ret != info->size) {
>>>>>> +        error_setg(errp, "failed to read IGD OpRegion");
>>>>>> +        g_free(vdev->igd_opregion);
>>>>>> +        vdev->igd_opregion = NULL;
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Provide fw_cfg with a copy of the OpRegion which the VM firmware is to
>>>>>> +     * allocate 32bit reserved memory for, copy these contents into, and write
>>>>>> +     * the reserved memory base address to the device ASLS register at 0xFC.
>>>>>> +     * Alignment of this reserved region seems flexible, but using a 4k page
>>>>>> +     * alignment seems to work well. This interface assumes a single IGD
>>>>>> +     * device, which may be at VM address 00:02.0 in legacy mode or another
>>>>>> +     * address in UPT mode.
>>>>>> +     *
>>>>>> +     * NB, there may be future use cases discovered where the VM should have
>>>>>> +     * direct interaction with the host OpRegion, in which case the write to
>>>>>> +     * the ASLS register would trigger MemoryRegion setup to enable that.
>>>>>> +     */
>>>>>> +    fw_cfg_add_file(fw_cfg_find(), "etc/igd-opregion",
>>>>>> +                    vdev->igd_opregion, info->size);
>>>>>> +
>>>>>> +    trace_vfio_pci_igd_opregion_enabled(vdev->vbasedev.name);
>>>>>> +
>>>>>> +    pci_set_long(vdev->pdev.config + IGD_ASLS, 0);
>>>>>> +    pci_set_long(vdev->pdev.wmask + IGD_ASLS, ~0);
>>>>>> +    pci_set_long(vdev->emulated_config_bits + IGD_ASLS, ~0);
>>>>>> +
>>>>>> +    return true;
>>>>>> +}
>>>>>> +
>>>>>>  /*
>>>>>>   * The rather short list of registers that we copy from the host devices.
>>>>>>   * The LPC/ISA bridge values are definitely needed to support the vBIOS, the
>>>>>> @@ -339,29 +389,83 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>>>>>>      QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, bdsm_quirk, next);
>>>>>>  }
>>>>>> +static bool vfio_igd_try_enable_opregion(VFIOPCIDevice *vdev, Error **errp)
>>>>>> +{
>>>>>> +    g_autofree struct vfio_region_info *opregion = NULL;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    /*
>>>>>> +     * Hotplugging is not supprted for both opregion access and legacy mode.
>>>>>> +     * For legacy mode, we also need to mark the ROM failed.
>>>>>> +     */
>>>>>
>>>>> The explanation was a little better in the removed comment.
>>>>>
>>>>>> +    if (vdev->pdev.qdev.hotplugged) {
>>>>>> +        vdev->rom_read_failed = true;
>>>>>> +        error_setg(errp,
>>>>>> +                   "IGD OpRegion is not supported on hotplugged device");
>>>>>
>>>>> As was the error log.
>>>>>
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    ret = vfio_get_dev_region_info(&vdev->vbasedev,
>>>>>> +                    VFIO_REGION_TYPE_PCI_VENDOR_TYPE | PCI_VENDOR_ID_INTEL,
>>>>>> +                    VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION, &opregion);
>>>>>> +    if (ret) {
>>>>>> +        error_setg_errno(errp, -ret,
>>>>>> +                         "device does not supports IGD OpRegion feature");
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    return true;
>>>>>> +}
>>>>>> +
>>>>>>  bool vfio_probe_igd_config_quirk(VFIOPCIDevice *vdev,
>>>>>> -                                 Error **errp G_GNUC_UNUSED)
>>>>>> +                                 Error **errp)
>>>>>>  {
>>>>>>      g_autofree struct vfio_region_info *rom = NULL;
>>>>>> -    g_autofree struct vfio_region_info *opregion = NULL;
>>>>>>      g_autofree struct vfio_region_info *host = NULL;
>>>>>>      g_autofree struct vfio_region_info *lpc = NULL;
>>>>>> +    PCIBus *bus;
>>>>>>      PCIDevice *lpc_bridge;
>>>>>>      int ret, gen;
>>>>>> +    bool legacy_mode, enable_opregion;
>>>>>>      uint64_t gms_size;
>>>>>>      uint64_t *bdsm_size;
>>>>>>      uint32_t gmch;
>>>>>>      Error *err = NULL;
>>>>>> +    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>>>>>> +        !vfio_is_vga(vdev)) {
>>>>>> +        return true;
>>>>>> +    }
>>>>>> +
>>>>>>      /*
>>>>>>       * This must be an Intel VGA device at address 00:02.0 for us to even
>>>>>>       * consider enabling legacy mode. The vBIOS has dependencies on the
>>>>>>       * PCI bus address.
>>>>>>       */
>>>>>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>>>>>> -        !vfio_is_vga(vdev) ||
>>>>>> -        &vdev->pdev != pci_find_device(pci_device_root_bus(&vdev->pdev),
>>>>>> -                                       0, PCI_DEVFN(0x2, 0))) {
>>>>>> +    bus = pci_device_root_bus(&vdev->pdev);
>>>>>> +    legacy_mode = (&vdev->pdev == pci_find_device(bus, 0, PCI_DEVFN(0x2, 0)));
>>>>>> +    enable_opregion = (vdev->features & VFIO_FEATURE_ENABLE_IGD_OPREGION);
>>>>>> +
>>>>>> +    if (!enable_opregion && !legacy_mode) {
>>>>>> +        return true;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!vfio_igd_try_enable_opregion(vdev, &err)) {
>>>>>> +        if (enable_opregion) {
>>>>>> +            error_propagate(errp, err);
>>>>>> +            return false;
>>>>>> +        } else if (legacy_mode) {
>>>>>> +            error_append_hint(&err, "IGD legacy mode disabled\n");
>>>>>> +            error_report_err(err);
>>>>>> +            return true;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!legacy_mode) {
>>>>>>          return true;
>>>>>>      }
>>>>>> @@ -404,30 +508,10 @@ bool vfio_probe_igd_config_quirk(VFIOPCIDevice *vdev,
>>>>>>          return true;
>>>>>>      }
>>>>>> -    /*
>>>>>> -     * Ignore the hotplug corner case, mark the ROM failed, we can't
>>>>>> -     * create the devices we need for legacy mode in the hotplug scenario.
>>>>>> -     */
>>>>>> -    if (vdev->pdev.qdev.hotplugged) {
>>>>>> -        error_report("IGD device %s hotplugged, ROM disabled, "
>>>>>> -                     "legacy mode disabled", vdev->vbasedev.name);
>>>>>> -        vdev->rom_read_failed = true;
>>>>>> -        return true;
>>>>>> -    }
>>>>>> -
>>>>>>      /*
>>>>>>       * Check whether we have all the vfio device specific regions to
>>>>>>       * support legacy mode (added in Linux v4.6). If not, bail.
>>>>>>       */
>>>>>> And we're disassociating opregion setup from this useful comment.
>>>>>
>>>>> What are we actually accomplishing here? What specific code
>>>>> duplication is eliminated?
>>>>
>>>> This patch is designed for moving the opregion quirk in vfio_realize()
>>>> to igd.c, for better isolation of the igd-specific code. Legacy mode
>>>> also need to initialize opregion as x-igd-opregion=on option. These
>>>> code are almost the same, except legacy mode continues on error, while
>>>> x-igd-opregion fails.
>>>>
>>>> I am going to clearify that in the commit message of v3.
>>>>
>>>>> Why is it important to move all this code to igd.c?
>>>
>>> x-igd-opreqion quirk is currently located in pci-quirks.c, which is not
>>> controlled by CONFIG_VFIO_IGD, moving it to igd.c prevents building
>>> that unnecessary code in certain binaries, for example, non x86 builds.
>>>
>>>>> It's really difficult to untangle this refactor, I think it could be
>>>>> done more iteratively, if it's really even beneficial. Thanks,
>>>>>
>>>>> Alex
>>>>
>>>> Agreed. Actually I'd like to totally remove the "legacy mode" and "UPT
>>>> mode" concept in future, my proposal is:
>>>> * Emulate and initialize ASLS and BDSM register unconditionally. These
>>>>   registers holds HPA, keeping the old value to guest is not a good idea
>>>> * Make the host bridge and LPC bridge ID quirks optional like OpRegion.
>>>>   Recent Linux kernel and Windows driver seems not relying on it. This
>>>>   enables IGD passthrough on Q35 machines, but probably without UEFI
>>>>   GOP or VBIOS support, as it is observed they require specific LPC
>>>>   bridge DID to work.
>>>> * Remove the requirement of IGD device class being VGA controller, this
>>>>   was previous discussed in my kernel change [1]
>>>> * Update the document
>>>>
>>>> It would time consuming to implement all them, coding is not difficult,
>>>> but I have to verify my change on diffrent platforms. And they are out
>>>> of this patchset's scope I think. I personally perfers doing it in a
>>>> future patchset.
>>>>
>>>> [1] https://lore.kernel.org/all/20250123163416.7653-1-tomitamo...@gmail.com/
>>>>
>>>> Thanks,
>>>> Moeko
>>>
>>> Please let me know if you have any thoughts or suggestions, in case
>>> you missed the previous mail.
>>
>> TBH, I'm surprised there's so much interest in direct assignment of
>> igd. I'd be curious in your motivation, if you can share it.
>>
>> Regardless, it's nice to see it updated for newer hardware and I don't
>> mind the goal of isolating the code so it can be disabled on other
>> archs, so long as we can do so in small, logical steps that are easy to
>> follow.
>>
>> At this point, the idea of legacy vs UPT might only exist in QEMU.
>> There are going to be some challenges to avoid breaking existing VM
>> command lines if the host and LPC bridge quirks become optional. There
>> are a couple x-igd- options that we're free to break as they've always
>> been experimental, but the implicit LPC bridge and host bridge quirks
>> need to be considered carefully. The fact that "legacy" mode has never
>> previously worked on q35 could mean that we can tie those quirks to a
>> new experimental option that's off by default and only enabled for
>> 440fx machine types.
>>
>> I'm glad you included the documentation update in your list, it's
>> clearly out of date, as is some of my knowledge regarding guest driver
>> requirements.
>
> Could we please have an update of docs/igd-assign.txt too ?
>
> As some point, we should consolidate all VFIO documentation under
> one section. That's another topic.
>
>> I hope we can make some progress on uefi support as well,
>> as that's essentially a requirement for newer guests. If we can't get
>> the code upstream into edk2, maybe we can at least document steps for
>> others to create images. Thanks,
>>
>
> So, I am bit lot here, forgive my ignorance.
>
> I am seeing issues (a black screen and nothing else to report) with :
>
> 00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c)
>
> using uefi, seabios, pc or q35 does not change the result.
>
>
> However, it works fine with a uefi q35 machine using :
>
> 00:02.0 VGA compatible controller: Intel Corporation Alder Lake-N [UHD Graphics]
>
> How can I dig into the first issue ?
>

If you are running a Linux guest, `dmesg | grep i915` in the guest is
always a good start. Adding `drm.debug` to the guest kernel command line
logs more details. I mainly use `drm.debug=0x6` for IGD passthrough
(it enables driver and KMS logging).
https://docs.kernel.org/gpu/drm-internals.html

> Also, if we know that there are platform requirements for IGD assignment to
> work, we should try to verify that they are met when the machine boots.

Well noted.

> Thanks,
>
> C.
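
P.S. In case it helps when picking other log levels: the 0x6 value above is a bitmask of DRM debug categories, and on current kernels the module parameter is spelled `drm.debug`. Below is a rough Python sketch that composes and decodes such a mask. The bit values are what I believe the DRM core uses, so please check them against the kernel you are running; the helper names `drm_debug_mask` and `drm_debug_decode` are made up for illustration and are not part of any tool.

```python
# Sketch only: bit values as I understand the drm.debug parameter;
# verify against the DRM documentation for your guest kernel.
DRM_DEBUG_BITS = {
    "core": 0x01,
    "driver": 0x02,
    "kms": 0x04,
    "prime": 0x08,
    "atomic": 0x10,
    "vbl": 0x20,
    "state": 0x40,
    "lease": 0x80,
    "dp": 0x100,
}


def drm_debug_mask(*categories):
    """Hypothetical helper: OR together the bits for the named categories."""
    mask = 0
    for name in categories:
        mask |= DRM_DEBUG_BITS[name.lower()]
    return mask


def drm_debug_decode(mask):
    """Hypothetical helper: list the category names set in a mask."""
    return [name for name, bit in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    mask = drm_debug_mask("driver", "kms")
    # Prints the string to append to the guest kernel command line.
    print(f"drm.debug={mask:#x}")
```

Running it with "driver" and "kms" gives the value I use. Adding "atomic" may help if the black screen turns out to be a modeset problem, at the cost of a much noisier log.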