Hi Igor, On 2/22/19 5:27 PM, Igor Mammedov wrote: > On Wed, 20 Feb 2019 23:39:46 +0100 > Eric Auger <eric.au...@redhat.com> wrote: > >> This series aims to bump the 255GB RAM limit in machvirt and to >> support device memory in general, and especially PCDIMM/NVDIMM. >> >> In machvirt versions < 4.0, the initial RAM starts at 1GB and can >> grow up to 255GB. From 256GB onwards we find IO regions such as the >> additional GICv3 RDIST region, high PCIe ECAM region and high PCIe >> MMIO region. The address map was 1TB large. This corresponded to >> the max IPA capacity KVM was able to manage. >> >> Since 4.20, the host kernel is able to support a larger and dynamic >> IPA range. So the guest physical address can go beyond the 1TB. The >> max GPA size depends on the host kernel configuration and physical CPUs. >> >> In this series we use this feature and allow the RAM to grow without >> any other limit than the one put by the host kernel. >> >> The RAM still starts at 1GB. First comes the initial ram (-m) of size >> ram_size and then comes the device memory (,maxmem) of size >> maxram_size - ram_size. The device memory is potentially hotpluggable >> depending on the instantiated memory objects. >> >> IO regions previously located between 256GB and 1TB are moved after >> the RAM. Their offset is dynamically computed, depends on ram_size >> and maxram_size. Size alignment is enforced. >> >> In case maxmem value is inferior to 255GB, the legacy memory map >> still is used. The change of memory map becomes effective from 4.0 >> onwards. >> >> As we keep the initial RAM at 1GB base address, we do not need to do >> invasive changes in the EDK2 FW. It seems nobody is eager to do >> that job at the moment. >> >> Device memory being put just after the initial RAM, it is possible >> to get access to this feature while keeping a 1TB address map. >> >> This series reuses/rebases patches initially submitted by Shameer >> in [1] and Kwangwoo in [2] for the PC-DIMM and NV-DIMM parts. >> >> Functionally, the series is split into 3 parts: >> 1) bump of the initial RAM limit [1 - 9] and change in >> the memory map > >> 2) Support of PC-DIMM [10 - 13] > Is this part complete ACPI wise (for coldplug)? I haven't noticed > DSDT AML here no E820 changes, so ACPI wise pc-dimm shouldn't be > visible to the guest. It might be that DT is masking problem > but well, that won't work on ACPI only guests.
guest /proc/meminfo or "lshw -class memory" reflects the amount of mem added with the DIMM slots. So it looks fine to me. Isn't E820 a pure x86 matter? What else would you expect in the dsdt? I understand hotplug would require extra modifications but I don't see anything else missing for coldplug. > Even though I've tried make mem hotplug ACPI parts not x86 specific, > I'm afraid it might be tightly coupled with hotplug support. > So here are 2 options make DSDT part work without hotplug or > implement hotplug here. I think the former is just a waste of time > and we should just add hotplug. It should take relatively minor effort > since you already implemented most of boiler plate here. Shameer sent an RFC series for supporting hotplug. [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support https://patchwork.kernel.org/cover/10783589/ I tested PCDIMM hotplug (with ACPI) this afternoon and it seemed to be OK, even after system_reset. Note the hotplug kernel support on ARM is very recent. I would prefer to dissociate both efforts if we want to get a chance making coldplug for 4.0. Also we have an issue for NVDIMM since on reboot the guest does not boot properly. > > As for how to implement ACPI HW part, I suggest to borrow GED > device that NEMU guys trying to use instead of GPIO route, > like we do now for ACPI_POWER_BUTTON_DEVICE to deliver event. > So that it would be easier to share this with their virt-x86 > machine eventually. Sounds like a different approach than the one initiated by Shameer? Thanks Eric > > >> 3) Support of NV-DIMM [14 - 17] > The same might be true for NUMA but I haven't dug this deep in to > that part. > >> >> 1) can be upstreamed before 2 and 2 can be upstreamed before 3. >> >> Work is ongoing to transform the whole memory as device memory. >> However this move is not trivial and to me, is independent on >> the improvements brought by this series: >> - if we were to use DIMM for initial RAM, those DIMMs would use >> use slots. Although they would not be part of the ones provided >> using the ",slots" options, they are ACPI limited resources. >> - DT and ACPI description needs to be reworked >> - NUMA integration needs special care >> - a special device memory object may be required to avoid consuming >> slots and easing the FW description. >> >> So I preferred to separate the concerns. This new implementation >> based on device memory could be candidate for another virt >> version. >> >> Best Regards >> >> Eric >> >> References: >> >> [0] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions >> http://patchwork.ozlabs.org/cover/914694/ >> >> [1] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform >> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html >> >> This series can be found at: >> https://github.com/eauger/qemu/tree/v3.1.0-dimm-v7 >> >> History: >> >> v6 -> v7: >> - Addressed Peter and Igor comments (exceptions sent my email) >> - Fixed TCG case. Now device memory works also for TCG and vcpu >> pamax is checked >> - See individual logs for more details >> >> v5 -> v6: >> - mingw compilation issue fix >> - kvm_arm_get_max_vm_phys_shift always returns the number of supported >> IPA bits >> - new patch "hw/arm/virt: Rename highmem IO regions" that eases the review >> of "hw/arm/virt: Split the memory map description" >> - "hw/arm/virt: Move memory map initialization into machvirt_init" >> squashed into the previous patch >> - change alignment of IO regions beyond the RAM so that it matches their >> size >> >> v4 -> v5: >> - change in the memory map >> - see individual logs >> >> v3 -> v4: >> - rebase on David's "pc-dimm: next bunch of cleanups" and >> "pc-dimm: pre_plug "slot" and "addr" assignment" >> - kvm-type option not used anymore. We directly use >> maxram_size and ram_size machine fields to compute the >> MAX IPA range. Migration is naturally handled as CLI >> option are kept between source and destination. This was >> suggested by David. >> - device_memory_start and device_memory_size not stored >> anymore in vms->bootinfo >> - I did not take into account 2 Igor's comments: the one >> related to the refactoring of arm_load_dtb and the one >> related to the generation of the dtb after system_reset >> which would contain nodes of hotplugged devices (we do >> not support hotplug at this stage) >> - check the end-user does not attempt to hotplug a device >> - addition of "vl: Set machine ram_size, maxram_size and >> ram_slots earlier" >> >> v2 -> v3: >> - fix pc_q35 and pc_piix compilation error >> - kwangwoo's email being not valid anymore, remove his address >> >> v1 -> v2: >> - kvm_get_max_vm_phys_shift moved in arch specific file >> - addition of NVDIMM part >> - single series >> - rebase on David's refactoring >> >> v1: >> - was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size" >> - was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB" >> >> Best Regards >> >> Eric >> >> >> Eric Auger (12): >> hw/arm/virt: Rename highmem IO regions >> hw/arm/virt: Split the memory map description >> hw/boards: Add a MachineState parameter to kvm_type callback >> kvm: add kvm_arm_get_max_vm_ipa_size >> vl: Set machine ram_size, maxram_size and ram_slots earlier >> hw/arm/virt: Dynamic memory map depending on RAM requirements >> hw/arm/virt: Implement kvm_type function for 4.0 machine >> hw/arm/virt: Bump the 255GB initial RAM limit >> hw/arm/virt: Add memory hotplug framework >> hw/arm/virt: Allocate device_memory >> hw/arm/boot: Expose the pmem nodes in the DT >> hw/arm/virt: Add nvdimm and nvdimm-persistence options >> >> Kwangwoo Lee (2): >> nvdimm: use configurable ACPI IO base and size >> hw/arm/virt: Add nvdimm hot-plug infrastructure >> >> Shameer Kolothum (3): >> hw/arm/boot: introduce fdt_add_memory_node helper >> hw/arm/boot: Expose the PC-DIMM nodes in the DT >> hw/arm/virt-acpi-build: Add PC-DIMM in SRAT >> >> accel/kvm/kvm-all.c | 2 +- >> default-configs/arm-softmmu.mak | 4 + >> hw/acpi/nvdimm.c | 31 ++- >> hw/arm/boot.c | 136 ++++++++++-- >> hw/arm/virt-acpi-build.c | 23 +- >> hw/arm/virt.c | 364 ++++++++++++++++++++++++++++---- >> hw/i386/pc_piix.c | 6 +- >> hw/i386/pc_q35.c | 6 +- >> hw/ppc/mac_newworld.c | 3 +- >> hw/ppc/mac_oldworld.c | 2 +- >> hw/ppc/spapr.c | 2 +- >> include/hw/arm/virt.h | 24 ++- >> include/hw/boards.h | 5 +- >> include/hw/mem/nvdimm.h | 4 + >> target/arm/kvm.c | 10 + >> target/arm/kvm_arm.h | 13 ++ >> vl.c | 6 +- >> 17 files changed, 556 insertions(+), 85 deletions(-) >> > >