>>> >>> * It has been passing the tests with various combinations like 64KB >>> and 4KB page sizes on host and guest, different memory device >>> backends like normal, transparent huge page and HugeTLB, plus >>> migration. >> >> Perfect. A note that hugetlbfs isn't fully supported/safe to use until >> we have preallocation support in QEMU (WIP). >> > > Yes, there is some warnings raised to enlarge 'request-size' on > host with 64KB page size. Note that the memory backends I used > in the testings always have "prealloc=on" property though.
1. prealloc=on "prealloc=on" on the memory backend won't get the job done, because the first thing virtio-mem does is discard all memory in the memory backend again when it initializes. So it's an expensive NOP :) See https://lkml.kernel.org/r/20211130104136.40927-9-da...@redhat.com for the virtio-mem "prealloc=on" option that preallocates memory when exposing that memory to the VM. To use huge pages in a safe way with virtio-mem, we need "reserve=off" on the memory backend and "prealloc=on" on the virtio-mem device. (I'm in the process of documenting that on virtio-mem.gitlab.io/ to make it clearer) 2. Warning on arm64 with 64k I assume the warning you're seeing is regarding the block-size: "Read unsupported THP size: ..." followed by "Could not detect THP size, falling back to ..." The right thing to do for now should be to remove that sanity check: diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index d5a578142b..33c32afeb1 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -78,11 +78,8 @@ static uint32_t virtio_mem_thp_size(void) if (g_file_get_contents(HPAGE_PMD_SIZE_PATH, &content, NULL, NULL) && !qemu_strtou64(content, &endptr, 0, &tmp) && (!endptr || *endptr == '\n')) { - /* - * Sanity-check the value, if it's too big (e.g., aarch64 with 64k base - * pages) or weird, fallback to something smaller. - */ - if (!tmp || !is_power_of_2(tmp) || tmp > 16 * MiB) { + /* Sanity-check the value and fallback to something reasonable. */ + if (!tmp || !is_power_of_2(tmp)) { warn_report("Read unsupported THP size: %" PRIx64, tmp); } else { thp_size = tmp; This will not affect setups we care about ( x86-64 KVM ). It will imply that with a arm64 64k host, we can only hot(un)plug in 512 MiB granularity when not using hugetlb witht he default block-size. However, that is already the case with arm64 64k guests as well. The suggestion will be to use arm64 4k with virtio-mem in the host and the guest for increased flexibility -- fortunately most distros already have performed the switch to 4k on arm64, so we don't really care IMHO. To support block_size < THP size when not using hugetlb, we'll have to disable THP (via MADV_NOHUGEPAGE) permanently for the memory region, also making sure that e.g., postcopy won't re-enable it by adding a proper flag (RAM_NOHUGEPAGE) to the RAMBlock. Because the issue is that once the guest touches some memory, we might populate a THP that would cover plugged and unplugged memory, which is bad. Instead of warning in virtio_mem_device_realize() when vmem->block_size < virtio_mem_default_block_size(rb) we'd have to disable THP. Further, we should fixup the default THP size on arm64 in case we're running on an older kernel where we cannot sense the THP size: diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index d5a578142b..371cee380a 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -38,13 +38,21 @@ */ #define VIRTIO_MEM_MIN_BLOCK_SIZE ((uint32_t)(1 * MiB)) +static uint32_t virtio_mem_default_thp_size(void) +{ +#if defined(__aarch64__) + if (qemu_real_host_page_size == 64 * KiB) { + return 512 * MiB; + } +#endif #if defined(__x86_64__) || defined(__arm__) || defined(__aarch64__) || \ defined(__powerpc64__) -#define VIRTIO_MEM_DEFAULT_THP_SIZE ((uint32_t)(2 * MiB)) + return 2 * MiB; #else - /* fallback to 1 MiB (e.g., the THP size on s390x) */ -#define VIRTIO_MEM_DEFAULT_THP_SIZE VIRTIO_MEM_MIN_BLOCK_SIZE + /* fallback to 1 MiB (e.g., the THP size on s390x) */ + return VIRTIO_MEM_MIN_BLOCK_SIZE; #endif +} /* * We want to have a reasonable default block size such that @@ -90,7 +98,7 @@ static uint32_t virtio_mem_thp_size(void) } if (!thp_size) { - thp_size = VIRTIO_MEM_DEFAULT_THP_SIZE; + thp_size = virtio_mem_default_thp_size(); warn_report("Could not detect THP size, falling back to %" PRIx64 " MiB.", thp_size / MiB); } In the context of proper arm64 support, adding the above two changes should be good enough. If you agree, can you include them in your v2 series as a separate patch? Supporting block_size < thp_size when not using hugetlb is a different work IMHO, if someone ever cares about that. Thanks! -- Thanks, David / dhildenb