This bug is awaiting verification that the linux-azure/5.15.0-1006.7
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy' to 'verification-done-jammy'. If the
problem still exists, change the tag 'verification-needed-jammy' to
'verification-failed-jammy'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-jammy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1972662

Title:
  [Azure] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time

Status in linux-azure package in Ubuntu:
  Fix Released
Status in linux-azure source package in Focal:
  Fix Committed
Status in linux-azure source package in Impish:
  Fix Committed
Status in linux-azure source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification

  [Impact]

  A VM on Azure can have 14 GPUs, and each GPU may have a huge MMIO BAR,
  e.g. 128 GB. Currently the boot time of such a VM can be 4+ minutes, and
  most of the time is used by the host to unmap/map the vBAR from/to pBAR
  when the VM clears and sets the PCI_COMMAND_MEMORY bit: each unmap/map
  operation for a 128 GB BAR needs about 1.8 seconds, and the pci-hyperv
  driver and the Linux PCI subsystem flip the PCI_COMMAND_MEMORY bit
  eight times (see pci_setup_device() -> pci_read_bases() and
  pci_std_update_resource()), increasing the boot time by 1.8 * 8 = 14.4
  seconds per GPU, i.e. 14.4 * 14 = 201.6 seconds in total.

  Fix the slowness by not turning on the PCI_COMMAND_MEMORY bit in
  pci-hyperv.c, so the bit stays off until the PCI device driver calls
  pci_enable_device(): while the bit is off, pci_read_bases() and
  pci_std_update_resource() don't cause Hyper-V to unmap/map the vBARs.
  With this change, the boot time of such a VM is reduced by
  1.8 * (8-1) * 14 = 176.4 seconds.
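
  The arithmetic above can be re-derived with a short script. This is
  only a sketch using the figures quoted in this description; the names
  are illustrative, and the single remaining flip per GPU is an
  assumption read off the "(8-1)" term, not something measured here.

```python
# Re-derive the boot-time figures quoted in the bug description.
# Integer milliseconds are used to avoid floating-point rounding noise.

MAP_COST_MS = 1800   # ~1.8 s per unmap/map of a 128 GB BAR (from the description)
FLIPS_BEFORE = 8     # PCI_COMMAND_MEMORY flips per GPU without the fix
FLIPS_AFTER = 1      # ASSUMPTION: one flip per GPU remains, per the "(8-1)" term
GPUS = 14            # GPUs in the example VM


def boot_overhead_ms(flips, gpus=GPUS, cost_ms=MAP_COST_MS):
    """Total time spent on vBAR unmap/map operations, in milliseconds."""
    return cost_ms * flips * gpus


before_ms = boot_overhead_ms(FLIPS_BEFORE)
after_ms = boot_overhead_ms(FLIPS_AFTER)
saved_ms = before_ms - after_ms

if __name__ == "__main__":
    print(f"overhead before fix: {before_ms / 1000} s")
    print(f"overhead after fix:  {after_ms / 1000} s")
    print(f"boot time saved:     {saved_ms / 1000} s")
```

  Under those assumptions roughly 25 seconds of map/unmap overhead
  would remain, out of the original 201.6 seconds.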

  [Test Case]

  Microsoft tested

  [Where things could go wrong]

  PCI BAR setup could fail or be incorrect.
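
  One possible spot check on a test VM, offered as a sketch and not as
  the test Microsoft ran: read a device's PCI command register from its
  sysfs config file and see whether the memory-decoding bit is set once
  the device driver has loaded. The register offset (0x04) and bit
  value (0x2) match PCI_COMMAND and PCI_COMMAND_MEMORY in the kernel's
  pci_regs.h; the helper names and the device address in the usage note
  are hypothetical.

```python
import struct

PCI_COMMAND = 0x04          # offset of the 16-bit command register
PCI_COMMAND_MEMORY = 0x2    # "memory space enable" bit


def memory_decoding_enabled(config: bytes) -> bool:
    """Return True if PCI_COMMAND_MEMORY is set in raw config-space bytes."""
    if len(config) < PCI_COMMAND + 2:
        raise ValueError("config space dump is too short")
    # PCI configuration space is little-endian.
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND)
    return bool(command & PCI_COMMAND_MEMORY)


def read_config(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> bytes:
    """Read the first 64 bytes of config space for a device address."""
    with open(f"{sysfs_root}/{bdf}/config", "rb") as f:
        return f.read(64)
```

  For example, memory_decoding_enabled(read_config("0001:00:00.0"))
  would be expected to return True after the GPU driver has called
  pci_enable_device(); the address shown is a placeholder.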

  [Other Info]

  SF: #00336342

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1972662/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
