On Tue, May 20, 2025 at 02:07:48AM +0000, Michael Kelley wrote:
> From: Saurabh Singh Sengar <ssen...@linux.microsoft.com> Sent: Monday, May 
> 19, 2025 9:55 AM
> > 
> > On Sat, May 17, 2025 at 06:47:22PM +0000, Michael Kelley wrote:
> > > From: Saurabh Singh Sengar <ssen...@linux.microsoft.com> Sent: Saturday, 
> > > May 17, 2025 9:14 AM
> > > >
> > > > On Sat, May 17, 2025 at 01:34:20PM +0000, Michael Kelley wrote:
> > > > > From: Saurabh Singh Sengar <ssen...@microsoft.com> Sent: Friday, May 
> > > > > 16, 2025 9:38 PM
> > > > > >
> > > > > > > From: Michael Kelley <mhkli...@outlook.com>
> > > > > > >
> > > > > > > The Hyper-V host provides guest VMs with a range of MMIO 
> > > > > > > addresses that
> > > > > > > guest VMBus drivers can use. The VMBus driver in Linux manages 
> > > > > > > that MMIO
> > > > > > > space, and allocates portions to drivers upon request. As part of 
> > > > > > > managing
> > > > > > > that MMIO space in a Generation 2 VM, the VMBus driver must 
> > > > > > > reserve the
> > > > > > > portion of the MMIO space that Hyper-V has designated for the 
> > > > > > > synthetic
> > > > > > > frame buffer, and not allocate this space to VMBus drivers other 
> > > > > > > than graphics
> > > > > > > framebuffer drivers. The synthetic frame buffer MMIO area is 
> > > > > > > described by
> > > > > > > the screen_info data structure that is passed to the Linux kernel 
> > > > > > > at boot time,
> > > > > > > so the VMBus driver must access screen_info for Generation 2 VMs. 
> > > > > > > (In
> > > > > > > Generation 1 VMs, the framebuffer MMIO space is communicated to 
> > > > > > > the
> > > > > > > guest via a PCI pseudo-device, and access to screen_info is not 
> > > > > > > needed.)
> > > > > > >
> > > > > > > In commit a07b50d80ab6 ("hyperv: avoid dependency on 
> > > > > > > screen_info") the
> > > > > > > VMBus driver's access to screen_info is restricted to when 
> > > > > > > CONFIG_SYSFB is
> > > > > > > enabled. CONFIG_SYSFB is typically enabled in kernels built for 
> > > > > > > Hyper-V by
> > > > > > > virtue of having at least one of CONFIG_FB_EFI, CONFIG_FB_VESA, or
> > > > > > > CONFIG_SYSFB_SIMPLEFB enabled, so the restriction doesn't usually 
> > > > > > > affect
> > > > > > > anything. But it's valid to have none of these enabled, in which 
> > > > > > > case
> > > > > > > CONFIG_SYSFB is not enabled, and the VMBus driver is unable to 
> > > > > > > properly
> > > > > > > reserve the framebuffer MMIO space for graphics framebuffer 
> > > > > > > drivers. The
> > > > > > > framebuffer MMIO space may be assigned to some other VMBus 
> > > > > > > driver, with
> > > > > > > undefined results. As an example, if a VM is using a PCI 
> > > > > > > pass-thru NVMe
> > > > > > > controller to host the OS disk, the PCI NVMe controller is probed 
> > > > > > > before any
> > > > > > > graphics devices, and the NVMe controller is assigned a portion of 
> > > > > > > the
> > > > > > > framebuffer MMIO space.
> > > > > > > Hyper-V reports an error to Linux during the probe, and the OS 
> > > > > > > disk fails to get set up. Then Linux fails to boot in the VM.
> > > > > > >
> > > > > > > Fix this by having CONFIG_HYPERV always select SYSFB. Then the 
> > > > > > > VMBus
> > > > > > > driver in a Gen 2 VM can always reserve the MMIO space for the 
> > > > > > > graphics
> > > > > > > framebuffer driver, and prevent the undefined behavior.
> > > > > >
> > > > > > One question: shouldn't SYSFB be selected by the actual graphics 
> > > > > > framebuffer driver that is expected to use it? With this patch, the 
> > > > > > option will be enabled regardless of whether there is any user for 
> > > > > > it, so I'm wondering whether we can better optimize this for such 
> > > > > > systems.
> > > > > >
> > > > >
> > > > > That approach doesn't work. For a cloud-based server, it might make
> > > > > sense to build a kernel image without either of the Hyper-V graphics
> > > > > framebuffer drivers (DRM_HYPERV or HYPERV_FB) since in that case the
> > > > > Linux console is the serial console. But the problem could still occur
> > > > > where a PCI pass-thru NVMe controller tries to use the MMIO space
> > > > > that Hyper-V intends for the framebuffer. That problem is directly 
> > > > > tied
> > > > > to CONFIG_SYSFB because it's the VMBus driver that must treat the
> > > > > framebuffer MMIO space as special. The absence or presence of a
> > > > > framebuffer driver isn't the key factor, though we've been 
> > > > > (incorrectly)
> > > > > relying on the presence of a framebuffer driver to set CONFIG_SYSFB.
> > > > >
> > > >
> > > > Thank you for the clarification. I was concerned because SYSFB is not 
> > > > currently
> > > > enabled in the OpenHCL kernel, and our goal is to keep the OpenHCL 
> > > > configuration
> > > > as minimal as possible. I haven't yet looked into the details to 
> > > > determine
> > > > whether this might have any impact on the kernel binary size or runtime 
> > > > memory
> > > > usage. I trust it won't have a negative impact.
> > > >
> > > > OpenHCL Config Ref:
> > > > https://github.com/microsoft/OHCL-Linux-Kernel/blob/product/hcl-main/6.12/Microsoft/hcl-x64.config
> > > >
> > >
> > > Good point.
> > >
> > > The OpenHCL code tree has commit a07b50d80ab6 that restricts the
> > > screen_info to being available only when CONFIG_SYSFB is enabled.
> > > But since OpenHCL in VTL2 gets its firmware info via OF instead of ACPI,
> > > I'm unsure what the Hyper-V host tells it about available MMIO space,
> > > and whether that space includes MMIO space for a framebuffer. If it
> > > doesn't, then OpenHCL won't have the problem I describe above, and
> > > it won't need CONFIG_SYSFB. This patch could be modified to do
> > >
> > > select SYSFB if !HYPERV_VTL_MODE
> > 
> > I am worried that this is not very scalable; there could be more such
> > Hyper-V systems in the future.
> 
> I could see scalability being a problem if there were 20 more such
> Hyper-V systems in the future. But if there are just 2 or 3 more, that
> seems like it would be manageable.
> 
> Regardless, I'm OK with doing this with or without the
> "if !HYPERV_VTL_MODE". I don't think we should just drop this
> entirely. When playing around with various framebuffers drivers
> a few weeks back, I personally encountered the problem of having
> built a kernel that wouldn't boot in an Azure VM with an NVMe OS
> disk. I couldn't figure out why probing the NVMe controller failed.
> It took me an hour to sort out what was happening, and I was
> familiar with the Hyper-V PCI driver. I'd like to prevent such a
> problem from happening to someone else.

I agree we want to fix this.

> 
> > 
> > >
> > > Can you find out what MMIO space Hyper-V provides to VTL2 via OF?
> > > It would make sense if no framebuffer is provided. And maybe
> > > screen_info itself is not set up when VTL2 is loaded, which would
> > > also make adding CONFIG_SYSFB pointless for VTL2.
> > 
> > I can only see the below address range passed for MMIO to the VMBus driver:
> > ranges = <0x0f 0xf0000000 0x0f 0xf0000000 0x10000000>;
> 
> I'm guessing the above text is what shows up in DT?  I'm not sure
> how to interpret it. In normal guests, Hyper-V offers a "low MMIO"
> range that is below the 4 GiB line, and a "high MMIO" range that
> is just before the 64 GiB line. In a normal guest in Azure, I see the
> MLX driver using 0xfc0000000, which would be just below the 64 GiB
> line, and in the "high MMIO" range. The "0x0f 0xf0000000" in DT might
> be physical address 0xff0000000, which is consistent with the
> "high MMIO" range.  I'm not sure how to interpret the second
> occurrence of "0x0f 0xf0000000". I'm guessing the 0x10000000
> (256 MiB) is the length of the available range, which would also
> make sense.
> 
> The framebuffer address is always in the "low MMIO" range. So
> if my interpretation is anywhere close to correct, DT isn't
> specifying any MMIO space for a framebuffer, and there's
> no need for CONFIG_SYSFB in a kernel running in VTL2.
> 
> What's your preference for how to proceed? Adding CONFIG_SYSFB
> probably *will* increase the kernel code size, but I don't know
> by how much. I can do a measurement.

My primary preference is to ensure that OpenHCL remains unaffected.
Since I can't think of a better alternative, I am fine with proceeding
with the "if !HYPERV_VTL_MODE" condition.
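
Just so we're looking at the same thing, I'd expect the change to end up
looking roughly like the below in drivers/hv/Kconfig (only a sketch from
my side; the exact surrounding dependencies and select list will differ):

    config HYPERV
            tristate "Microsoft Hyper-V client drivers"
            ...
            # Reserve the Gen 2 synthetic framebuffer MMIO range described
            # by screen_info, except in VTL2 (OpenHCL), where the DT does
            # not describe any framebuffer MMIO.
            select SYSFB if !HYPERV_VTL_MODE

That way a regular guest kernel always gets the framebuffer reservation
logic, and the OpenHCL configuration stays as it is today.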

with that,
Reviewed-by: Saurabh Sengar <ssen...@linux.microsoft.com>

- Saurabh
