On Tue, May 28, 2024 at 09:35:22AM -0400, Peter Xu wrote:
> On Tue, May 28, 2024 at 02:27:57PM +1000, Nicholas Piggin wrote:
> > There is no need to use /dev/shm for file-backed memory devices, and
> > it is too small to be usable in gitlab CI. Switch to using a regular
> > file in /tmp/ which will usually have more space available.
> >
> > Signed-off-by: Nicholas Piggin <npig...@gmail.com>
> > ---
> > Am I missing something? AFAIKS there is not even any point using
> > /dev/shm aka tmpfs anyway; there is not much special about it as a
> > filesystem. This applies on top of the series just sent, and passes
> > gitlab CI qtests including aarch64.
>
> I think it's just that /dev/shm guarantees shmem usage, while the var
> "tmpfs" implies g_dir_make_tmp(), which may be another non-RAM-based
> filesystem. That is slightly different from what a real user would
> use - we don't suggest users put guest RAM on things like btrfs.
>
> One real implication is that if we add a postcopy test, it'll fail
> with g_dir_make_tmp() when it is not pointing to a shmem mount, as
> UFFDIO_REGISTER will fail there. But that test doesn't exist yet, as
> the QEMU paths should be the same even though Linux will trigger
> different paths when different types of memory are used (anonymous
> vs. shmem).
>
> If the goal here is to properly handle the case where tmpfs doesn't
> have enough space, how about what I suggested in the other email?
>
> https://lore.kernel.org/r/ZlSppKDE6wzjCF--@x1n
>
> IOW, try to populate the shmem region before starting the guest, and
> skip the test if population failed. Would that work?
Let me append some more info here.

I think madvise() isn't needed, as fallocate() should already do the
population work, AFAIU. That means we pass the shmem path to QEMU, and
QEMU should notice that the memory-backend-file already exists and
open() it directly. I did a quick walk through the QEMU memory code and
so far it all looks applicable, so QEMU should just start the guest
with the pre-populated shmem page cache.

There's one trick where qemu_ram_mmap() will map some extra pages (on
x86, 4k), and I don't yet know why we did that:

/*
 * Note: this always allocates at least one extra page of virtual address
 * space, even if size is already aligned.
 */
total = size + align;

But that is only used in mmap_reserve(), not mmap_activate(), so the
real mmap() should still cover exactly what we fallocate()ed.
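To make the pre-population idea concrete, here's a minimal sketch of
what the test could do before spawning QEMU (untested; the helper name
is made up just for illustration):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Try to populate the shmem file that will back guest RAM. */
static bool prepopulate_shmem(const char *path, off_t size)
{
    int fd = open(path, O_CREAT | O_RDWR, 0600);

    if (fd < 0) {
        return false;
    }
    /*
     * fallocate() allocates the pages up front, so ENOSPC on a
     * too-small /dev/shm shows up here, rather than as a SIGBUS
     * once the guest starts touching its RAM.
     */
    if (fallocate(fd, 0, 0, size) < 0) {
        close(fd);
        unlink(path);
        return false;
    }
    close(fd);
    return true;
}

The test would then skip itself (e.g. with g_test_skip()) when
prepopulate_shmem() returns false, instead of failing later in CI.

Thanks,

-- 
Peter Xu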