On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxian...@hisilicon.com> wrote:
>
> Hi Ard,
>
>
> On 2022/11/30 16:18, Ard Biesheuvel wrote:
> > On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <m...@kernel.org> wrote:
> >> On Wed, 30 Nov 2022 02:52:35 +0000,
> >> "chenxiang (M)" <chenxian...@hisilicon.com> wrote:
> >>> Hi,
> >>>
> >>> We boot the VM using the following commands (with nvdimm on) (qemu
> >>> version 6.1.50, kernel 6.0-r4):
> >> How relevant is the presence of the nvdimm? Do you observe the failure
> >> without this?
> >>
> >>> qemu-system-aarch64 -machine
> >>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel
> >>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
> >>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
> >>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
> >>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
> >>> -object memory-backend-ram,id=ram1,size=10G -device
> >>> nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1
> >>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
> >>>
> >>> Then in the VM we insmod a module, and a vmalloc error occurs as follows
> >>> (kernel 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4):
> >>>
> >>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
> >>> [ 8.186563] vmap allocation for size 20480 failed: use
> >>> vmalloc=<size> to increase size
> >> Have you tried increasing the vmalloc size to check that this is
> >> indeed the problem?
> >>
> >> [...]
> >>
> >>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr:
> >>> defer initialization to initcall where permitted").
> >> I guess you mean commit fc5a89f75d2a instead, right?
> >>
> >>> Do you have any idea about the issue?
> >> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
> >> portion of the vmalloc space, but you give very little information
> >> that could help here...
> >>
> > Ouch.
> > I suspect what's going on here: that patch defers the
> > randomization of the module region, so that we can decouple it from
> > the very early init code.
> >
> > Obviously, it is happening too late now, and the randomized module
> > region is overlapping with a vmalloc region that is in use by the time
> > the randomization occurs.
> >
> > Does the below fix the issue?
>
> The issue still occurs, but the change seems to decrease the probability:
> before, it occurred almost every time; after the change, I tried 2-3 times
> and it occurred.
> But when I change back "subsys_initcall" to "core_initcall", I tested more
> than 20 times, and it is still OK.
>
Thank you for confirming. I will send out a patch today.

> >
> > diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
> > index 37a9deed2aec..71fb18b2f304 100644
> > --- a/arch/arm64/kernel/kaslr.c
> > +++ b/arch/arm64/kernel/kaslr.c
> > @@ -90,4 +90,4 @@ static int __init kaslr_init(void)
> >
> >         return 0;
> >  }
> > -subsys_initcall(kaslr_init)
> > +arch_initcall(kaslr_init)
> > .
> >
>
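For anyone following the thread, here is a rough userspace sketch of why the initcall level matters. It is not kernel code: the level numbers mirror the usual ordering in include/linux/init.h (core, postcore, arch, subsys, ...), while the struct, the toy_* helpers, and every entry name other than kaslr_init are hypothetical and exist only for illustration. Which initcall actually claims the conflicting vmalloc range is not identified in this thread, so the sketch only models "lower level runs earlier".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Initcall levels; a lower number runs earlier during boot. */
enum initcall_level {
    LEVEL_CORE = 1,     /* core_initcall()     */
    LEVEL_POSTCORE = 2, /* postcore_initcall() */
    LEVEL_ARCH = 3,     /* arch_initcall()     */
    LEVEL_SUBSYS = 4,   /* subsys_initcall()   */
    LEVEL_FS = 5,       /* fs_initcall()       */
    LEVEL_DEVICE = 6,   /* device_initcall()   */
    LEVEL_LATE = 7      /* late_initcall()     */
};

/* Hypothetical registration record; the kernel uses linker sections instead. */
struct toy_initcall {
    const char *name;
    enum initcall_level level;
};

/*
 * Position at which `name` would run if the table were executed level by
 * level, keeping table order within a level. Returns -1 if not found.
 */
int toy_run_position(const struct toy_initcall *calls, size_t n,
                     const char *name)
{
    int pos = 0;

    assert(calls != NULL && name != NULL);
    for (int level = LEVEL_CORE; level <= LEVEL_LATE; level++) {
        for (size_t i = 0; i < n; i++) {
            if ((int)calls[i].level != level)
                continue;
            if (strcmp(calls[i].name, name) == 0)
                return pos;
            pos++;
        }
    }
    return -1;
}

/* Nonzero if `a` runs strictly before `b`; zero otherwise or if missing. */
int toy_runs_before(const struct toy_initcall *calls, size_t n,
                    const char *a, const char *b)
{
    int pa = toy_run_position(calls, n, a);
    int pb = toy_run_position(calls, n, b);

    return pa >= 0 && pb >= 0 && pa < pb;
}
```

Filling a table with kaslr_init at a given level plus a made-up earlier vmalloc user, then calling toy_runs_before(), shows the point of the diff: moving the registration to a lower-numbered level moves the randomization ahead of anything registered at later levels. Order within one level depends on link order in the real kernel, which this toy only approximates with table order.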