The MMIO region size required to support virtualized environments with large PCI BAR regions can exceed the hardcoded limit configured in QEMU. For example, a VM with multiple NVIDIA Grace-Hopper GPUs passed through requires more MMIO memory than the amount provided by VIRT_HIGH_PCIE_MMIO (currently 512GB). Instead of updating VIRT_HIGH_PCIE_MMIO, introduce a new parameter, highmem-mmio-size, that specifies the MMIO size required to support the VM configuration.
Example usage with 1TB MMIO region size: -machine virt,gic-version=3,highmem-mmio-size=1T Signed-off-by: Matthew R. Ochs <mo...@nvidia.com> Reviewed-by: Gavin Shan <gs...@redhat.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.th...@huawei.com> Reviewed-by: Eric Auger <eric.au...@redhat.com> Reviewed-by: Nicolin Chen <nicol...@nvidia.com> --- v6: - Fixed minor coding style nit v5: - Removed hyphens from power of 2 - Consistently use property name in all error messages - Use #defines for default high PCIE MMIO size - Use size_to_str() when printing size values - Add comment clarifying that highmem-mmio-size will update the corresponding value in extended_memmap v4: - Added default size to highmem-mmio-size description v3: - Updated highmem-mmio-size description v2: - Add unit suffix to example in commit message - Use existing "high memory region" terminology - Resolve minor braces nit docs/system/arm/virt.rst | 4 ++++ hw/arm/virt.c | 52 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 55 insertions(+), 1 deletion(-) diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst index 0c9c2ce0351c..adf446c0a295 100644 --- a/docs/system/arm/virt.rst +++ b/docs/system/arm/virt.rst @@ -144,6 +144,10 @@ highmem-mmio Set ``on``/``off`` to enable/disable the high memory region for PCI MMIO. The default is ``on``. +highmem-mmio-size + Set the high memory region size for PCI MMIO. Must be a power of 2 and + greater than or equal to the default size (512G). + gic-version Specify the version of the Generic Interrupt Controller (GIC) to provide. Valid values are: diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 4a5a9666e916..ee69081ef421 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -53,6 +53,7 @@ #include "hw/loader.h" #include "qapi/error.h" #include "qemu/bitops.h" +#include "qemu/cutils.h" #include "qemu/error-report.h" #include "qemu/module.h" #include "hw/pci-host/gpex.h" @@ -192,6 +193,10 @@ static const MemMapEntry base_memmap[] = { [VIRT_MEM] = { GiB, LEGACY_RAMLIMIT_BYTES }, }; +/* Update the docs for highmem-mmio-size when changing this default */ +#define DEFAULT_HIGH_PCIE_MMIO_SIZE_GB 512 +#define DEFAULT_HIGH_PCIE_MMIO_SIZE (DEFAULT_HIGH_PCIE_MMIO_SIZE_GB * GiB) + /* * Highmem IO Regions: This memory map is floating, located after the RAM. * Each MemMapEntry base (GPA) will be dynamically computed, depending on the @@ -207,13 +212,16 @@ static const MemMapEntry base_memmap[] = { * PA space for one specific region is always reserved, even if the region * has been disabled or doesn't fit into the PA space. However, the PA space * for the region won't be reserved in these circumstances with compact layout. + * + * Note that the highmem-mmio-size property will update the high PCIE MMIO size + * field in this array. */ static MemMapEntry extended_memmap[] = { /* Additional 64 MB redist region (can contain up to 512 redistributors) */ [VIRT_HIGH_GIC_REDIST2] = { 0x0, 64 * MiB }, [VIRT_HIGH_PCIE_ECAM] = { 0x0, 256 * MiB }, /* Second PCIe window */ - [VIRT_HIGH_PCIE_MMIO] = { 0x0, 512 * GiB }, + [VIRT_HIGH_PCIE_MMIO] = { 0x0, DEFAULT_HIGH_PCIE_MMIO_SIZE }, }; static const int a15irqmap[] = { @@ -2550,6 +2558,40 @@ static void virt_set_highmem_mmio(Object *obj, bool value, Error **errp) vms->highmem_mmio = value; } +static void virt_get_highmem_mmio_size(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + uint64_t size = extended_memmap[VIRT_HIGH_PCIE_MMIO].size; + + visit_type_size(v, name, &size, errp); +} + +static void virt_set_highmem_mmio_size(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + uint64_t size; + + if (!visit_type_size(v, name, &size, errp)) { + return; + } + + if (!is_power_of_2(size)) { + error_setg(errp, "highmem-mmio-size is not a power of 2"); + return; + } + + if (size < DEFAULT_HIGH_PCIE_MMIO_SIZE) { + char *sz = size_to_str(DEFAULT_HIGH_PCIE_MMIO_SIZE); + error_setg(errp, "highmem-mmio-size cannot be set to a lower value " + "than the default (%s)", sz); + g_free(sz); + return; + } + + extended_memmap[VIRT_HIGH_PCIE_MMIO].size = size; +} static bool virt_get_its(Object *obj, Error **errp) { @@ -3207,6 +3249,14 @@ static void virt_machine_class_init(ObjectClass *oc, void *data) "Set on/off to enable/disable high " "memory region for PCI MMIO"); + object_class_property_add(oc, "highmem-mmio-size", "size", + virt_get_highmem_mmio_size, + virt_set_highmem_mmio_size, + NULL, NULL); + object_class_property_set_description(oc, "highmem-mmio-size", + "Set the high memory region size " + "for PCI MMIO"); + object_class_property_add_str(oc, "gic-version", virt_get_gic_version, virt_set_gic_version); object_class_property_set_description(oc, "gic-version", -- 2.46.0