On Tue, May 09, 2023 at 09:36:45PM +0200, Salvatore Bonaccorso wrote:
> Control: tags -1 + moreinfo
>
> Hi Jared,
>
> On Mon, May 08, 2023 at 11:50:21PM -0600, Jared Epp wrote:
> > Package: src:linux
> > Version: 5.10.178-3
> > Severity: normal
> > X-Debbugs-Cc: jared...@pm.me
> >
> > Dear Maintainer,
> >
> > After I updated my Debian 11 host kernel to 5.10.0-22, my VM guest
> > (Windows 10 using KVM / qemu / libvirt) no longer boots and there's
> > a kernel null pointer dereference along with a call trace, etc. in
> > the system log. If I reboot and choose 5.10.0-21 in grub, the VM
> > works as expected and there's no error in the log.
> >
> > Below, reportbug included part of the kernel log but it missed part
> > of the problem so I pasted that in, I hope that's okay. If you need
> > any other information let me know.
> >
> > Thanks
> >
> > Jared Epp
> >
> > -- Package-specific info:
> > ** Version:
> > Linux version 5.10.0-22-amd64 (debian-kernel@lists.debian.org) (gcc-10
> > (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2)
> > #1 SMP Debian 5.10.178-3 (2023-04-22)
> >
> > ** Command line:
> > BOOT_IMAGE=/vmlinuz-5.10.0-22-amd64 root=/dev/mapper/panthro--vg-root ro
> > quiet mem_sleep_default=s2idle default_hugepagesz=1G hugepages=8
> >
> > ** Tainted: D (128)
> > * kernel died recently, i.e. there was an OOPS or BUG
> >
> > ** Kernel log:
> > [ 51.576266] BUG: kernel NULL pointer dereference, address:
> > 0000000000000000
> > [ 51.576269] #PF: supervisor read access in kernel mode
> > [ 51.576270] #PF: error_code(0x0000) - not-present page
> > [ 51.576271] PGD 0 P4D 0
> > [ 51.576273] Oops: 0000 [#1] SMP NOPTI
> > [ 51.576275] CPU: 6 PID: 2209 Comm: CPU 0/KVM Not tainted 5.10.0-22-amd64
> > #1 Debian 5.10.178-3
> > [ 51.576276] Hardware name: ASUS System Product Name/CROSSHAIR VI HERO,
> > BIOS 8701 02/08/2023
> > [ 51.576280] RIP: 0010:find_first_bit+0x19/0x40
> > [ 51.576281] Code: 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc cc cc 49
> > 89 f0 48 85 f6 74 28 31 c0 eb 0d 48 83 c0 40 48 83 c7 08 4c 39 c0 73 17
> > <48> 8b 17 48 85 d2 74 eb f3 48 0f bc d2 48 01 d0 49 39 c0 4c 0f 47
> > [ 51.576282] RSP: 0018:ffffa99ac3a23a30 EFLAGS: 00010246
> > [ 51.576283] RAX: 0000000000000000 RBX: ffffa99ac38a5000 RCX:
> > 0000000000000000
> > [ 51.576283] RDX: 0000000000000000 RSI: 0000000000000120 RDI:
> > 0000000000000000
> > [ 51.576284] RBP: 0000000000000000 R08: 0000000000000120 R09:
> > ffff94e2c1ae72a8
> > [ 51.576284] R10: 000000000000000f R11: 0000000000000000 R12:
> > ffff94e2c1ae72a8
> > [ 51.576285] R13: 0000000000000323 R14: 0000000000000003 R15:
> > 0000000000000006
> > [ 51.576286] FS: 0000000000000000(0053) GS:ffff94e89e980000(002b)
> > knlGS:fffff8033f006000
> > [ 51.576286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 51.576287] CR2: 0000000000000000 CR3: 000000018e4ee000 CR4:
> > 0000000000750ee0
> > [ 51.576287] PKRU: 55555554
> > [ 51.576288] Call Trace:
> > [ 51.576307] kvm_make_vcpus_request_mask+0x38/0xf0 [kvm]
> > [ 51.576319] kvm_hv_flush_tlb+0x147/0x370 [kvm]
> > [ 51.576328] ? kvm_page_track_is_active+0x12/0x50 [kvm]
> > [ 51.576336] ? make_spte+0x146/0x260 [kvm]
> > [ 51.576344] ? mmu_spte_update+0x11/0x1c0 [kvm]
> > [ 51.576351] ? set_spte+0xee/0x140 [kvm]
> > [ 51.576358] ? mmu_set_spte+0x327/0x4a0 [kvm]
> > [ 51.576365] ? kvm_release_pfn_clean+0x22/0x40 [kvm]
> > [ 51.576372] ? direct_page_fault+0x223/0xa20 [kvm]
> > [ 51.576374] ? svm_get_segment+0x18/0x100 [kvm_amd]
> > [ 51.576382] ? kvm_get_cs_db_l_bits+0x35/0x70 [kvm]
> > [ 51.576383] ? svm_get_segment+0x18/0x100 [kvm_amd]
> > [ 51.576390] ? kvm_get_cs_db_l_bits+0x35/0x70 [kvm]
> > [ 51.576398] kvm_hv_hypercall+0x176/0x580 [kvm]
> > [ 51.576401] ? get_cpu_vendor+0x40/0xa0
> > [ 51.576403] ? native_load_tr_desc+0x67/0x70
> > [ 51.576411] kvm_arch_vcpu_ioctl_run+0xbe8/0x1740 [kvm]
> > [ 51.576419] kvm_vcpu_ioctl+0x21e/0x5b0 [kvm]
> > [ 51.576422] __x64_sys_ioctl+0x8b/0xc0
> > [ 51.576424] do_syscall_64+0x33/0x80
> > [ 51.576426] entry_SYSCALL_64_after_hwframe+0x61/0xc6
> > [ 51.576428] RIP: 0033:0x7fad816f2237
> > [ 51.576429] Code: 00 00 00 48 8b 05 59 cc 0d 00 64 c7 00 26 00 00 00 48
> > c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05
> > <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 29 cc 0d 00 f7 d8 64 89 01 48
> > [ 51.576429] RSP: 002b:00007fad7ce65508 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000010
> > [ 51.576430] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX:
> > 00007fad816f2237
> > [ 51.576431] RDX: 0000000000000000 RSI: 000000000000ae80 RDI:
> > 000000000000001c
> > [ 51.576431] RBP: 000055a3e17511c0 R08: 000055a3df109848 R09:
> > 000055a3df5335c0
> > [ 51.576432] R10: 0000000000000000 R11: 0000000000000246 R12:
> > 0000000000000000
> > [ 51.576432] R13: 000055a3df54fbc0 R14: 00007fad7ce657c0 R15:
> > 0000000000802000
> > [ 51.576434] Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE
> > nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter vhost_net
> > vhost vhost_iotlb tap tun bridge stp llc overlay ip6t_REJECT nf_reject_ipv6
> > xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_multiport nft_limit
> > snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio
> > snd_hda_codec_hdmi snd_hda_intel nls_ascii snd_intel_dspcfg nls_cp437
> > soundwire_intel vfat soundwire_generic_allocation fat snd_soc_core
> > snd_compress soundwire_cadence snd_hda_codec edac_mce_amd xt_limit
> > xt_addrtype kvm_amd snd_hda_core xt_tcpudp snd_hwdep eeepc_wmi kvm
> > soundwire_bus xpad xt_conntrack cdc_acm joydev asus_wmi ff_memless snd_pcm
> > nf_conntrack battery sparse_keymap snd_timer nf_defrag_ipv6 rfkill
> > nf_defrag_ipv4 irqbypass nft_compat snd video rapl efi_pstore wmi_bmof
> > pcspkr ccp soundcore k10temp sp5100_tco nft_counter watchdog sg tpm_crb
> > tpm_tis tpm_tis_core tpm rng_core acpi_cpufreq evdev nf_tables libcrc32c
> > nfnetlink msr fuse
> > [ 51.576464] configfs efivarfs ip_tables x_tables autofs4 ext4 crc16
> > mbcache jbd2 crc32c_generic dm_crypt dm_mod hid_logitech_hidpp
> > hid_logitech_dj amdgpu sr_mod cdrom gpu_sched sd_mod hid_generic ttm
> > crc32_pclmul crc32c_intel usbhid hid drm_kms_helper ahci cec libahci
> > ghash_clmulni_intel xhci_pci libata xhci_hcd drm nvme aesni_intel mxm_wmi
> > igb libaes usbcore crypto_simd nvme_core cryptd scsi_mod glue_helper
> > i2c_piix4 dca ptp pps_core t10_pi i2c_algo_bit crc_t10dif crct10dif_generic
> > usb_common crct10dif_pclmul crct10dif_common wmi gpio_amdpt gpio_generic
> > button
> > [ 51.576484] CR2: 0000000000000000
> > [ 51.576485] ---[ end trace acfac62cc884c67c ]---
> > [ 51.668091] pstore: crypto_comp_compress failed, ret = -22!
> > [ 51.682455] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [ 51.682459] br-b8df22c12cd5: port 4(vethe67c4df) entered disabled state
> > [ 51.682501] device vethe67c4df entered promiscuous mode
> > [ 51.689861] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [ 51.689863] br-b8df22c12cd5: port 4(vethe67c4df) entered forwarding state
> > [ 51.696372] RIP: 0010:find_first_bit+0x19/0x40
> > [ 51.696374] Code: 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc cc cc 49
> > 89 f0 48 85 f6 74 28 31 c0 eb 0d 48 83 c0 40 48 83 c7 08 4c 39 c0 73 17
> > <48> 8b 17 48 85 d2 74 eb f3 48 0f bc d2 48 01 d0 49 39 c0 4c 0f 47
> > [ 51.696376] RSP: 0018:ffffa99ac3a23a30 EFLAGS: 00010246
> > [ 51.696378] RAX: 0000000000000000 RBX: ffffa99ac38a5000 RCX:
> > 0000000000000000
> > [ 51.696379] RDX: 0000000000000000 RSI: 0000000000000120 RDI:
> > 0000000000000000
> > [ 51.696380] RBP: 0000000000000000 R08: 0000000000000120 R09:
> > ffff94e2c1ae72a8
> > [ 51.696380] R10: 000000000000000f R11: 0000000000000000 R12:
> > ffff94e2c1ae72a8
> > [ 51.696381] R13: 0000000000000323 R14: 0000000000000003 R15:
> > 0000000000000006
> > [ 51.696383] FS: 0000000000000000(0053) GS:ffff94e89e980000(002b)
> > knlGS:fffff8033f006000
> > [ 51.696384] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 51.696384] CR2: 0000000000000000 CR3: 000000018e4ee000 CR4:
> > 0000000000750ee0
> > [ 51.696385] PKRU: 55555554
> > [ 51.700146] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [ 51.700151] br-924f74569f8a: port 4(veth5e7362a) entered disabled state
> > [ 51.700200] device veth5e7362a entered promiscuous mode
> > [ 51.700257] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [ 51.700259] br-924f74569f8a: port 4(veth5e7362a) entered forwarding state
> > [ 51.787480] eth0: renamed from veth28831ed
> > [ 51.831676] br-b8df22c12cd5: port 4(vethe67c4df) entered disabled state
> > [ 51.831721] br-924f74569f8a: port 4(veth5e7362a) entered disabled state
> > [ 51.831741] IPv6: ADDRCONF(NETDEV_CHANGE): veth25ca080: link becomes
> > ready
> > [ 51.831758] br-924f74569f8a: port 2(veth25ca080) entered blocking state
> > [ 51.831759] br-924f74569f8a: port 2(veth25ca080) entered forwarding state
> > [ 51.832280] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [ 51.832282] br-b8df22c12cd5: port 5(vethc8c90d8) entered disabled state
> > [ 51.832329] device vethc8c90d8 entered promiscuous mode
> > [ 51.832383] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [ 51.832385] br-b8df22c12cd5: port 5(vethc8c90d8) entered forwarding state
> > [ 51.832416] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [ 51.832418] br-924f74569f8a: port 5(vethfbf8266) entered disabled state
> > [ 51.832452] device vethfbf8266 entered promiscuous mode
> > [ 51.832503] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [ 51.832504] br-924f74569f8a: port 5(vethfbf8266) entered forwarding state
> > [ 51.955355] eth0: renamed from vethec69e8f
> > [ 51.999437] eth0: renamed from veth30923c8
> > [ 52.043965] br-b8df22c12cd5: port 5(vethc8c90d8) entered disabled state
> > [ 52.044034] br-924f74569f8a: port 5(vethfbf8266) entered disabled state
> > [ 52.044064] IPv6: ADDRCONF(NETDEV_CHANGE): veth1661ccb: link becomes
> > ready
> > [ 52.044086] br-924f74569f8a: port 1(veth1661ccb) entered blocking state
> > [ 52.044088] br-924f74569f8a: port 1(veth1661ccb) entered forwarding state
> > [ 52.044108] IPv6: ADDRCONF(NETDEV_CHANGE): veth0ebab0a: link becomes
> > ready
> > [ 52.044125] br-b8df22c12cd5: port 1(veth0ebab0a) entered blocking state
> > [ 52.044127] br-b8df22c12cd5: port 1(veth0ebab0a) entered forwarding state
> > [ 52.044539] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [ 52.044542] br-924f74569f8a: port 6(veth5372881) entered disabled state
> > [ 52.044586] device veth5372881 entered promiscuous mode
> > [ 52.044644] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [ 52.044646] br-924f74569f8a: port 6(veth5372881) entered forwarding state
> > [ 52.057025] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [ 52.057028] br-924f74569f8a: port 7(veth29a4d93) entered disabled state
> > [ 52.057108] device veth29a4d93 entered promiscuous mode
> > [ 52.057175] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [ 52.057176] br-924f74569f8a: port 7(veth29a4d93) entered forwarding state
> > [ 52.231474] eth0: renamed from veth9d75af1
> > [ 52.255847] br-924f74569f8a: port 6(veth5372881) entered disabled state
> > [ 52.255889] br-924f74569f8a: port 7(veth29a4d93) entered disabled state
> > [ 52.255906] IPv6: ADDRCONF(NETDEV_CHANGE): veth5e7362a: link becomes
> > ready
> > [ 52.255928] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [ 52.255929] br-924f74569f8a: port 4(veth5e7362a) entered forwarding state
> > [ 52.347639] eth0: renamed from veth0803979
> > [ 52.419482] eth0: renamed from veth7d2c7ef
> > [ 52.508188] eth1: renamed from vethee5bda2
> > [ 52.567361] IPv6: ADDRCONF(NETDEV_CHANGE): veth74b84d1: link becomes
> > ready
> > [ 52.567383] br-b8df22c12cd5: port 3(veth74b84d1) entered blocking state
> > [ 52.567385] br-b8df22c12cd5: port 3(veth74b84d1) entered forwarding state
> > [ 52.567531] IPv6: ADDRCONF(NETDEV_CHANGE): veth226976d: link becomes
> > ready
> > [ 52.567550] br-b8df22c12cd5: port 2(veth226976d) entered blocking state
> > [ 52.567551] br-b8df22c12cd5: port 2(veth226976d) entered forwarding state
> > [ 52.567730] IPv6: ADDRCONF(NETDEV_CHANGE): veth5372881: link becomes
> > ready
> > [ 52.567751] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [ 52.567752] br-924f74569f8a: port 6(veth5372881) entered forwarding state
> > [ 52.615463] eth1: renamed from vethef7b9c6
> > [ 52.643620] IPv6: ADDRCONF(NETDEV_CHANGE): vethfbf8266: link becomes
> > ready
> > [ 52.643649] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [ 52.643652] br-924f74569f8a: port 5(vethfbf8266) entered forwarding state
> > [ 52.690903] eth0: renamed from veth3feaf4a
> > [ 52.723412] IPv6: ADDRCONF(NETDEV_CHANGE): vethc8c90d8: link becomes
> > ready
> > [ 52.723440] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [ 52.723442] br-b8df22c12cd5: port 5(vethc8c90d8) entered forwarding state
> > [ 52.847525] eth1: renamed from vethd1e1e47
> > [ 52.877625] IPv6: ADDRCONF(NETDEV_CHANGE): veth888282e: link becomes
> > ready
> > [ 52.877656] br-924f74569f8a: port 3(veth888282e) entered blocking state
> > [ 52.877660] br-924f74569f8a: port 3(veth888282e) entered forwarding state
> > [ 52.877676] eth0: renamed from veth374d91f
> > [ 52.939600] IPv6: ADDRCONF(NETDEV_CHANGE): vethe67c4df: link becomes
> > ready
> > [ 52.939638] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [ 52.939641] br-b8df22c12cd5: port 4(vethe67c4df) entered forwarding state
> > [ 52.945134] eth1: renamed from veth583ed90
> > [ 52.971727] IPv6: ADDRCONF(NETDEV_CHANGE): veth29a4d93: link becomes
> > ready
> > [ 52.971776] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [ 52.971778] br-924f74569f8a: port 7(veth29a4d93) entered forwarding state
> > [ 60.145487] logitech-hidpp-device 0003:046D:4031.0007: HID++ 2.0 device
> > connected.
> > [ 63.847187] bridge0: port 1(vnet0) entered learning state
> > [ 65.643192] bridge0: port 2(enp5s0) entered learning state
> > [ 67.654690] kauditd_printk_skb: 50 callbacks suppressed
> > [ 67.654691] audit: type=1400 audit(1683610301.229:62): apparmor="DENIED"
> > operation="capable" profile="libvirtd" pid=1754 comm="prio-rpc-worker"
> > capability=17 capname="sys_rawio"
> > [ 78.951195] bridge0: port 1(vnet0) entered forwarding state
> > [ 78.951211] bridge0: topology change detected, propagating
> > [ 80.743422] bridge0: port 2(enp5s0) entered forwarding state
> > [ 80.743435] bridge0: topology change detected, propagating
>
> This sounds similar to the
> https://forum.proxmox.com/threads/with-latest-5-15-104-1-pve-windows-server-vm-freeze-stuck.125294/
> issue. Would you be able to verify two things:
>
> Check how the Windows VM is configured and if you pass the
> '+hv-tlbflush' flag.
>
> Additionally, would the attached patch make the issue go away?

Now with patch attached.

Regards,
Salvatore
From b7f8d59d71742cf2b0553c042560790532bda41a Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuzn...@redhat.com>
Date: Fri, 3 Sep 2021 09:51:36 +0200
Subject: [PATCH] KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with vcpu_mask==NULL

In preparation to making kvm_make_vcpus_request_mask() use
for_each_set_bit() switch kvm_hv_flush_tlb() to calling
kvm_make_all_cpus_request() for 'all cpus' case.

Note: kvm_make_all_cpus_request() (unlike kvm_make_vcpus_request_mask())
currently dynamically allocates cpumask on each call and this is suboptimal.
Both kvm_make_all_cpus_request() and kvm_make_vcpus_request_mask() are
going to be switched to using pre-allocated per-cpu masks.

Reviewed-by: Sean Christopherson <sea...@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Message-Id: <20210903075141.403071-4-vkuzn...@redhat.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 arch/x86/kvm/hyperv.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 09ec1cda2d68..e03e320847cd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1562,16 +1562,19 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
 
 	cpumask_clear(&hv_vcpu->tlb_flush);
 
-	vcpu_mask = all_cpus ? NULL :
-		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
-					vp_bitmap, vcpu_bitmap);
-
 	/*
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
-	kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
-				    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	if (all_cpus) {
+		kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST);
+	} else {
+		vcpu_mask = sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
+						    vp_bitmap, vcpu_bitmap);
+
+		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
+					    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	}
 
 ret_success:
 	/* We always do full TLB flush, set rep_done = rep_cnt. */
-- 
2.40.1
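
For readers following along: the quoted trace faults in find_first_bit() with RDI = 0, called from kvm_make_vcpus_request_mask(), which would be consistent with a NULL vcpu_mask reaching a bitmap walk. The userspace sketch below only illustrates the shape of the fix under that assumption. It is not kernel code, and every toy_* name is hypothetical and invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: test bit `nr` in `bitmap`. Like a real bitmap walk,
 * it dereferences `bitmap` unconditionally, so NULL cannot mean "all bits". */
int toy_test_bit(const unsigned long *bitmap, size_t nr)
{
    return (bitmap[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL;
}

/* Hypothetical stand-in for the masked request path: returns how many of the
 * first `nr_vcpus` vCPUs are selected by `mask`. The mask must not be NULL. */
size_t toy_request_mask(const unsigned long *mask, size_t nr_vcpus)
{
    size_t kicked = 0;

    assert(mask != NULL);
    for (size_t i = 0; i < nr_vcpus; i++) {
        if (toy_test_bit(mask, i))
            kicked++;
    }
    return kicked;
}

/* Hypothetical stand-in for the "all vCPUs" request path: no mask involved. */
size_t toy_request_all(size_t nr_vcpus)
{
    return nr_vcpus;
}

/* Same branching shape as the patched kvm_hv_flush_tlb(): the all-cpus case
 * takes its own path instead of passing a NULL mask to the masked one. */
size_t toy_flush_tlb(int all_cpus, const unsigned long *mask, size_t nr_vcpus)
{
    if (all_cpus)
        return toy_request_all(nr_vcpus);
    return toy_request_mask(mask, nr_vcpus);
}
```

With a mask of 0x5 and four vCPUs, toy_flush_tlb(0, mask, 4) selects two of them, while toy_flush_tlb(1, NULL, 4) selects all four without ever touching the mask pointer.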