Booting the GSP with vGPU enabled is part of the first milestone (M1)
for upstream vGPU support, together with the Rust fwctl abstraction [1]
and the nova-core fwctl driver [2]. It allows us to validate the basic
GSP boot flow with vGPU enabled and to upload vGPU types even before the
remaining nova-core dependencies are ready.

This RFC v2 series is rebased on top of john/nova-core-blackwell-v6
(linux 7.0.0-rc1) [3], with the ExtSriovCapability abstraction extracted
into a separate upstream patch [4] based on Gary's WIP work. It also
adds FSP PRC protocol support for querying the vGPU mode on Blackwell+
architectures, along with documentation for the FSP/PRC interface.

Besides the mailing list review, all M1 patches will be maintained
and iterated on the zhi/vgpu-m1-staging branch [5] until all the
dependencies are resolved.

v2:

- Adopt early-return style (Dirk).
- Add a #ifndef CONFIG_PCI_IOV helper stub to fix compilation when
  CONFIG_PCI_IOV is disabled (Alex).
- Change the return type from Result<i32> to Result<u16> to match the
  PCI spec field width, avoiding try_from() at call sites.
- Change GspVfInfo to a tuple struct (Alex).
- Use an unconditional constructor with Option wrapping instead of a
  bool parameter (Alex).
- Use a full initialization expression instead of mutating a zeroed
  value.
- Use the .chain() pattern in GspSetSystemInfo::init() for the optional
  vGPU info (Alex).
- Eliminate all magic numbers: add vf_bar_is_64bit() and
  read_vf_bar64_addr() to ExtSriovCapability using PCI binding
  constants (PCI_BASE_ADDRESS_MEM_TYPE_MASK, etc.).
- Use KVec<RegistryEntry> for dynamic registry entry construction
  instead of a hardcoded array (Timur, Joel, Alexandre).
- Replace the magic numbers 32/48 with the named binding constants
  MAX_PARTITIONS_WITH_GFID_32VM / MAX_PARTITIONS_WITH_GFID from
  OpenRM (Alex).
- Use read_poll_timeout() instead of a single read for the scrubber
  completion check (Joel).
- Use dev instead of pdev.as_ref() in dev_dbg!() (Dirk).
- Change the scrubber trigger condition from vgpu_requested to
  fb_layout.wpr2_heap.len() > SZ_256M, checking the actual heap size
  instead of the vGPU flag (Alex).

New patches:

- Factor out the common FSP message header, return the FSP response
  buffer to the caller, and add PRC protocol support for reading the
  vGPU mode from the FSP on Blackwell+.
- Add FSP/PRC protocol documentation (suggested by Joel).

The [!UPSTREAM] config space access patch from v1 is replaced by the
ExtSriovCapability abstraction submitted separately [4].

Dependencies Status
===================

[1] rust: introduce abstractions for fwctl v3              - In review
[2] gpu: nova-core: add fwctl driver                       - In review
[4] rust: pci: add extended capability and SR-IOV support  - RFC
[6] I/O projection API (Gary Guo)                          - WIP
[7] gsp: add continuation record support v6                - In review
[8] gsp: add locking to Cmdq v4                            - In review
[9] gsp: add RM control command infrastructure             - In review

[1] https://lore.kernel.org/rust-for-linux/[email protected]/
[2] https://lore.kernel.org/rust-for-linux/[email protected]/
[3] https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v6
[4] https://lore.kernel.org/rust-for-linux/[email protected]/
[5] https://github.com/zhiwang-nvidia/nova-core/tree/zhi/vgpu-m1-staging
[6] https://github.com/nbdd0121/linux/tree/io_projection
[7] https://lore.kernel.org/nouveau/[email protected]/
[8] https://lore.kernel.org/nouveau/[email protected]/
[9] https://lore.kernel.org/nouveau/[email protected]/

Zhi Wang (10):
  rust: pci: expose sriov_get_totalvfs() helper
  gpu: nova-core: factor out common FSP message header
  gpu: nova-core: return FSP response buffer to caller
  gpu: nova-core: read vGPU mode from FSP via PRC protocol
  gpu: nova-core: add FSP and PRC protocol documentation
  gpu: nova-core: introduce vgpu_support module param
  gpu: nova-core: populate GSP_VF_INFO when vGPU is enabled
  gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
  gpu: nova-core: reserve a larger GSP WPR2 heap when vGPU is enabled
  gpu: nova-core: load the scrubber ucode when vGPU support is enabled

 Documentation/gpu/nova/core/fsp.rst           | 135 ++++++++++
 Documentation/gpu/nova/index.rst              |   1 +
 drivers/gpu/nova-core/fb.rs                   |  21 +-
 drivers/gpu/nova-core/firmware.rs             |   3 +-
 drivers/gpu/nova-core/firmware/booter.rs      |   2 +
 drivers/gpu/nova-core/fsp.rs                  | 231 +++++++++++++++---
 drivers/gpu/nova-core/gpu.rs                  |  48 +++-
 drivers/gpu/nova-core/gsp.rs                  |  32 ++-
 drivers/gpu/nova-core/gsp/boot.rs             | 133 +++++++---
 drivers/gpu/nova-core/gsp/commands.rs         | 105 +++++---
 drivers/gpu/nova-core/gsp/fw.rs               |  90 ++++++-
 drivers/gpu/nova-core/gsp/fw/commands.rs      |  12 +-
 .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |   4 +
 drivers/gpu/nova-core/mctp.rs                 |   2 +
 drivers/gpu/nova-core/nova_core.rs            |  15 ++
 drivers/gpu/nova-core/regs.rs                 |  12 +
 drivers/gpu/nova-core/vgpu.rs                 |  37 +++
 rust/helpers/pci.c                            |   7 +
 rust/kernel/pci.rs                            |  14 ++
 19 files changed, 797 insertions(+), 107 deletions(-)
 create mode 100644 Documentation/gpu/nova/core/fsp.rst
 create mode 100644 drivers/gpu/nova-core/vgpu.rs

-- 
2.51.0
