On Tue, May 20, 2025 at 02:07:48AM +0000, Michael Kelley wrote:
> From: Saurabh Singh Sengar <ssen...@linux.microsoft.com> Sent: Monday, May 19, 2025 9:55 AM
> >
> > On Sat, May 17, 2025 at 06:47:22PM +0000, Michael Kelley wrote:
> > > From: Saurabh Singh Sengar <ssen...@linux.microsoft.com> Sent: Saturday, May 17, 2025 9:14 AM
> > > >
> > > > On Sat, May 17, 2025 at 01:34:20PM +0000, Michael Kelley wrote:
> > > > > From: Saurabh Singh Sengar <ssen...@microsoft.com> Sent: Friday, May 16, 2025 9:38 PM
> > > > > >
> > > > > > > From: Michael Kelley <mhkli...@outlook.com>
> > > > > > >
> > > > > > > The Hyper-V host provides guest VMs with a range of MMIO addresses that guest VMBus drivers can use. The VMBus driver in Linux manages that MMIO space, and allocates portions to drivers upon request. As part of managing that MMIO space in a Generation 2 VM, the VMBus driver must reserve the portion of the MMIO space that Hyper-V has designated for the synthetic frame buffer, and not allocate this space to VMBus drivers other than graphics framebuffer drivers. The synthetic frame buffer MMIO area is described by the screen_info data structure that is passed to the Linux kernel at boot time, so the VMBus driver must access screen_info for Generation 2 VMs. (In Generation 1 VMs, the framebuffer MMIO space is communicated to the guest via a PCI pseudo-device, and access to screen_info is not needed.)
> > > > > > >
> > > > > > > In commit a07b50d80ab6 ("hyperv: avoid dependency on screen_info") the VMBus driver's access to screen_info is restricted to when CONFIG_SYSFB is enabled. CONFIG_SYSFB is typically enabled in kernels built for Hyper-V by virtue of having at least one of CONFIG_FB_EFI, CONFIG_FB_VESA, or CONFIG_SYSFB_SIMPLEFB enabled, so the restriction doesn't usually affect anything. But it's valid to have none of these enabled, in which case CONFIG_SYSFB is not enabled, and the VMBus driver is unable to properly reserve the framebuffer MMIO space for graphics framebuffer drivers. The framebuffer MMIO space may be assigned to some other VMBus driver, with undefined results. As an example, if a VM is using a PCI pass-thru NVMe controller to host the OS disk, the PCI NVMe controller is probed before any graphics devices, and the NVMe controller is assigned a portion of the framebuffer MMIO space. Hyper-V reports an error to Linux during the probe, and the OS disk fails to get set up. Then Linux fails to boot in the VM.
> > > > > > >
> > > > > > > Fix this by having CONFIG_HYPERV always select SYSFB. Then the VMBus driver in a Gen 2 VM can always reserve the MMIO space for the graphics framebuffer driver, and prevent the undefined behavior.
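
For context, the reservation described in the commit message amounts to something like the sketch below. The helper name, the hyperv_mmio_root resource, and the region name are illustrative assumptions rather than the actual code in drivers/hv/vmbus_drv.c; the point is only that without CONFIG_SYSFB the screen_info-based reservation cannot happen at all.

    /*
     * Illustrative sketch only -- not the actual code in
     * drivers/hv/vmbus_drv.c. Reserve the Gen 2 synthetic framebuffer
     * range described by screen_info so that generic VMBus MMIO
     * allocations cannot hand it out to other drivers.
     */
    #include <linux/ioport.h>
    #include <linux/screen_info.h>

    static struct resource *hyperv_mmio_root; /* assumed: host-provided MMIO ranges */
    static struct resource *fb_mmio;          /* the reserved framebuffer region */

    static void example_reserve_fb(void)
    {
    #ifdef CONFIG_SYSFB
            resource_size_t start = screen_info.lfb_base;
            resource_size_t size  = screen_info.lfb_size;

            if (!start || !size)
                    return;

            /* Claim the range so later VMBus MMIO allocations skip it */
            fb_mmio = __request_region(hyperv_mmio_root, start, size,
                                       "Synthetic framebuffer (example)", 0);
    #endif
    }

Without CONFIG_SYSFB the body compiles away, which is exactly how the NVMe pass-thru case described above ends up overlapping the framebuffer range.
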
> > > > > >
> > > > > > One question: Shouldn't SYSFB be selected by the actual graphics framebuffer driver that is expected to use it? With this patch the option will be enabled irrespective of whether there is any user for it or not; I'm wondering if we can better optimize this for such systems.
> > > > >
> > > > > That approach doesn't work. For a cloud-based server, it might make sense to build a kernel image without either of the Hyper-V graphics framebuffer drivers (DRM_HYPERV or HYPERV_FB), since in that case the Linux console is the serial console. But the problem could still occur where a PCI pass-thru NVMe controller tries to use the MMIO space that Hyper-V intends for the framebuffer. That problem is directly tied to CONFIG_SYSFB because it's the VMBus driver that must treat the framebuffer MMIO space as special. The absence or presence of a framebuffer driver isn't the key factor, though we've been (incorrectly) relying on the presence of a framebuffer driver to set CONFIG_SYSFB.
> > > >
> > > > Thank you for the clarification. I was concerned because SYSFB is not currently enabled in the OpenHCL kernel, and our goal is to keep the OpenHCL configuration as minimal as possible. I haven't yet looked into the details to determine whether this might have any impact on the kernel binary size or runtime memory usage. I trust this won't have a negative effect.
> > > >
> > > > OpenHCL Config Ref:
> > > > https://github.com/microsoft/OHCL-Linux-Kernel/blob/product/hcl-main/6.12/Microsoft/hcl-x64.config
> > >
> > > Good point.
> > >
> > > The OpenHCL code tree has commit a07b50d80ab6 that restricts screen_info to being available only when CONFIG_SYSFB is enabled. But since OpenHCL in VTL2 gets its firmware info via OF instead of ACPI, I'm unsure what the Hyper-V host tells it about available MMIO space, and whether that space includes MMIO space for a framebuffer. If it doesn't, then OpenHCL won't have the problem I describe above, and it won't need CONFIG_SYSFB. This patch could be modified to do
> > >
> > >     select SYSFB if !HYPERV_VTL_MODE
> >
> > I am worried that this is not very scalable; there could be more such Hyper-V systems in the future.
>
> I could see scalability being a problem if there were 20 more such Hyper-V systems in the future. But if there are just 2 or 3 more, that seems like it would be manageable.
>
> Regardless, I'm OK with doing this with or without the "if !HYPERV_VTL_MODE". I don't think we should just drop this entirely. When playing around with various framebuffer drivers a few weeks back, I personally encountered the problem of having built a kernel that wouldn't boot in an Azure VM with an NVMe OS disk. I couldn't figure out why probing the NVMe controller failed. It took me an hour to sort out what was happening, and I was familiar with the Hyper-V PCI driver. I'd like to prevent such a problem from happening to someone else.
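
To make the "the VMBus driver must treat the framebuffer MMIO space as special" point concrete: VMBus child drivers obtain MMIO through vmbus_allocate_mmio(), and only framebuffer drivers are expected to pass fb_overlap_ok as true. Roughly like the following sketch, where the device pointer, size, and alignment are placeholders and not taken from any real driver:

    #include <linux/hyperv.h>
    #include <linux/ioport.h>

    static int example_claim_mmio(struct hv_device *hdev)
    {
            struct resource *res;
            int ret;

            /*
             * A non-framebuffer driver (e.g. the Hyper-V PCI pass-thru
             * driver) passes fb_overlap_ok = false, so the VMBus core
             * must already know where the framebuffer range is in order
             * to keep this allocation away from it.
             */
            ret = vmbus_allocate_mmio(&res, hdev,
                                      0,                   /* min */
                                      (resource_size_t)-1, /* max */
                                      0x100000,            /* size: placeholder, 1 MiB */
                                      0x1000,              /* align: placeholder, 4 KiB */
                                      false);              /* fb_overlap_ok */
            if (ret)
                    return ret;

            /* ... ioremap and use res->start ... */

            vmbus_free_mmio(res->start, resource_size(res));
            return 0;
    }

If the framebuffer range was never reserved because CONFIG_SYSFB is off, nothing keeps such an allocation out of it, which is the probe failure described in the commit message.
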
I agree we want to fix this.

> > > Can you find out what MMIO space Hyper-V provides to VTL2 via OF? It would make sense if no framebuffer is provided. And maybe screen_info itself is not set up when VTL2 is loaded, which would also make adding CONFIG_SYSFB pointless for VTL2.
> >
> > I can only see the below address range passed for MMIO to the VMBus driver:
> >
> >     ranges = <0x0f 0xf0000000 0x0f 0xf0000000 0x10000000>;
>
> I'm guessing the above text is what shows up in DT? I'm not sure how to interpret it. In normal guests, Hyper-V offers a "low MMIO" range that is below the 4 GiB line, and a "high MMIO" range that is just before the 64 GiB line. In a normal guest in Azure, I see the MLX driver using 0xfc0000000, which would be just below the 64 GiB line, and in the "high MMIO" range. The "0x0f 0xf0000000" in DT might be physical address 0xff0000000, which is consistent with the "high MMIO" range. I'm not sure how to interpret the second occurrence of "0x0f 0xf0000000". I'm guessing the 0x10000000 (256 MiB) is the length of the available range, which would also make sense.
>
> The framebuffer address is always in the "low MMIO" range. So if my interpretation is anywhere close to correct, DT isn't specifying any MMIO space for a framebuffer, and there's no need for CONFIG_SYSFB in a kernel running in VTL2.
>
> What's your preference for how to proceed? Adding CONFIG_SYSFB probably *will* increase the kernel code size, but I don't know by how much. I can do a measurement.

My primary preference is to ensure that OpenHCL remains unaffected. Since there is no better alternative I can think of, I am fine proceeding with the "if !HYPERV_VTL_MODE" approach. With that,

Reviewed-by: Saurabh Sengar <ssen...@linux.microsoft.com>

- Saurabh
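
P.S. For reference, the arithmetic behind the interpretation above, written out as a small standalone program. It assumes 2 cells for the child address, 2 cells for the parent address, and 1 cell for the size, which I have not confirmed against the VTL2 DT bindings; under that assumption the second "0x0f 0xf0000000" is simply the parent address of an identity mapping.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            /* ranges = <0x0f 0xf0000000  0x0f 0xf0000000  0x10000000>; */
            uint64_t child  = ((uint64_t)0x0f << 32) | 0xf0000000; /* 0xff0000000 */
            uint64_t parent = ((uint64_t)0x0f << 32) | 0xf0000000; /* identity map */
            uint64_t size   = 0x10000000;                          /* 256 MiB */

            printf("start: 0x%llx\n", (unsigned long long)child);          /* 0xff0000000 */
            printf("end:   0x%llx\n", (unsigned long long)(child + size)); /* 0x1000000000 = 64 GiB */
            printf("identity mapped: %d\n", child == parent);
            return 0;
    }

So the range runs from 0xff0000000 up to exactly the 64 GiB line, consistent with the "high MMIO just below 64 GiB" reading, and this entry describes no low-MMIO (framebuffer) space at all.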