Booting GSP with vGPU enabled is part of the first milestone (M1) for
upstream vGPU support, together with the Rust fwctl abstraction [1] and
the nova-core fwctl driver [2]. It allows us to validate the basic GSP
boot flow with vGPU enabled and to upload vGPU types even before the
remaining nova-core dependencies are ready.

This RFC v2 series is rebased on top of john/nova-core-blackwell-v6
(linux 7.0.0-rc1) [3], with the ExtSriovCapability abstraction extracted
into a separate upstream patch [4] based on Gary's WIP work. It also adds
FSP PRC protocol support for querying vGPU mode on Blackwell+
architectures, along with documentation for the FSP/PRC interface.

Besides the mailing list review, all the M1 patches will be maintained
and iterated on in the zhi/vgpu-m1-staging branch [5] until all the
dependencies are resolved.

v2:

- Adopt early-return style (Dirk).
- Add #ifndef CONFIG_PCI_IOV helper to fix compilation when
  CONFIG_PCI_IOV is disabled (Alex).
- Change return type from Result<i32> to Result<u16> to match the PCI
  spec field width, avoiding try_from at call sites.
- GspVfInfo changed to tuple struct (Alex).
- Use unconditional constructor with Option wrapping instead of bool
  parameter (Alex).
- Use full initialization expression instead of mutating a zeroed value.
- Use .chain() pattern in GspSetSystemInfo::init() for optional vGPU
  info (Alex).
- Eliminate all magic numbers: add vf_bar_is_64bit() and
  read_vf_bar64_addr() to ExtSriovCapability using PCI bindings
  constants (PCI_BASE_ADDRESS_MEM_TYPE_MASK, etc.).
- Use KVec<RegistryEntry> for dynamic registry entry construction
  instead of hardcoded array (Timur, Joel, Alexandre).
- Replace magic numbers 32/48 with named binding constants
  MAX_PARTITIONS_WITH_GFID_32VM / MAX_PARTITIONS_WITH_GFID from OpenRM
  (Alex).
- Use read_poll_timeout() instead of single read for scrubber completion
  check (Joel).
- Use dev instead of pdev.as_ref() in dev_dbg! (Dirk).
- Change scrubber trigger condition from vgpu_requested to
  fb_layout.wpr2_heap.len() > SZ_256M, checking actual heap size instead
  of vGPU flag (Alex).
- New patches:
  - Factor out common FSP message header, return response buffer to
    caller, and add PRC protocol support for reading vGPU mode from FSP
    on Blackwell+.
  - Add FSP/PRC protocol documentation (suggested by Joel).
- The [!UPSTREAM] config space access patch from v1 is replaced by the
  ExtSriovCapability abstraction submitted separately [4].

Dependencies Status
===================

[1] rust: introduce abstractions for fwctl v3 - In review
[2] gpu: nova-core: add fwctl driver - In review
[4] rust: pci: add extended capability and SR-IOV support - RFC
[6] I/O projection API (Gary Guo) - WIP
[7] gsp: add continuation record support v6 - In review
[8] gsp: add locking to Cmdq v4 - In review
[9] gsp: add RM control command infrastructure - In review

[1] https://lore.kernel.org/rust-for-linux/[email protected]/
[2] https://lore.kernel.org/rust-for-linux/[email protected]/
[3] https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v6
[4] https://lore.kernel.org/rust-for-linux/[email protected]/
[5] https://github.com/zhiwang-nvidia/nova-core/tree/zhi/vgpu-m1-staging
[6] https://github.com/nbdd0121/linux/tree/io_projection
[7] https://lore.kernel.org/nouveau/[email protected]/
[8] https://lore.kernel.org/nouveau/[email protected]/
[9] https://lore.kernel.org/nouveau/[email protected]/

Zhi Wang (10):
  rust: pci: expose sriov_get_totalvfs() helper
  gpu: nova-core: factor out common FSP message header
  gpu: nova-core: return FSP response buffer to caller
  gpu: nova-core: read vGPU mode from FSP via PRC protocol
  gpu: nova-core: add FSP and PRC protocol documentation
  gpu: nova-core: introduce vgpu_support module param
  gpu: nova-core: populate GSP_VF_INFO when vGPU is enabled
  gpu: nova-core: set RMSetSriovMode when NVIDIA vGPU is enabled
  gpu: nova-core: reserve a larger GSP WPR2 heap when vGPU is enabled
  gpu: nova-core: load the scrubber ucode when vGPU support is enabled

 Documentation/gpu/nova/core/fsp.rst           | 135 ++++++++++
 Documentation/gpu/nova/index.rst              |   1 +
 drivers/gpu/nova-core/fb.rs                   |  21 +-
 drivers/gpu/nova-core/firmware.rs             |   3 +-
 drivers/gpu/nova-core/firmware/booter.rs      |   2 +
 drivers/gpu/nova-core/fsp.rs                  | 231 +++++++++++++++---
 drivers/gpu/nova-core/gpu.rs                  |  48 +++-
 drivers/gpu/nova-core/gsp.rs                  |  32 ++-
 drivers/gpu/nova-core/gsp/boot.rs             | 133 +++++++---
 drivers/gpu/nova-core/gsp/commands.rs         | 105 +++++---
 drivers/gpu/nova-core/gsp/fw.rs               |  90 ++++++-
 drivers/gpu/nova-core/gsp/fw/commands.rs      |  12 +-
 .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |   4 +
 drivers/gpu/nova-core/mctp.rs                 |   2 +
 drivers/gpu/nova-core/nova_core.rs            |  15 ++
 drivers/gpu/nova-core/regs.rs                 |  12 +
 drivers/gpu/nova-core/vgpu.rs                 |  37 +++
 rust/helpers/pci.c                            |   7 +
 rust/kernel/pci.rs                            |  14 ++
 19 files changed, 797 insertions(+), 107 deletions(-)
 create mode 100644 Documentation/gpu/nova/core/fsp.rst
 create mode 100644 drivers/gpu/nova-core/vgpu.rs

--
2.51.0
