This bug was fixed in the package linux - 5.4.0-91.102

---------------
linux (5.4.0-91.102) focal; urgency=medium

  * focal/linux: 5.4.0-91.102 -proposed tracker (LP: #1949840)

  * Packaging resync (LP: #1786013)
    - [Packaging] update Ubuntu.md
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.08)

  * KVM emulation failure when booting into VM crash kernel with multiple CPUs (LP: #1948862)
    - KVM: x86: Properly reset MMU context at vCPU RESET/INIT

  * aufs: kernel bug with apparmor and fuseblk (LP: #1948470)
    - SAUCE: aufs: bugfix, stop omitting path->mnt

  * ebpf: bpf_redirect fails with ip6 gre interfaces (LP: #1947164)
    - net: handle ARPHRD_IP6GRE in dev_is_mac_header_xmit()

  * require CAP_NET_ADMIN to attach N_HCI ldisc (LP: #1949516)
    - Bluetooth: hci_ldisc: require CAP_NET_ADMIN to attach N_HCI ldisc

  * ACL updates on OCFS2 are not revalidated (LP: #1947161)
    - ocfs2: fix remounting needed after setfacl command

  * ppc64 BPF JIT mod by 1 will not return 0 (LP: #1948351)
    - powerpc/bpf: Fix BPF_MOD when imm == 1

  * Drop "UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active" (LP: #1947709)
    - Revert "UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active"

  * Reassign I/O Path of ConnectX-5 Port 1 before Port 2 causes NULL dereference (LP: #1943464)
    - s390/pci: fix leak of PCI device structure
    - s390/pci: fix use after free of zpci_dev
    - s390/pci: fix zpci_zdev_put() on reserve

  * [SRU][F] USB: serial: pl2303: add support for PL2303HXN (LP: #1948377)
    - USB: serial: pl2303: add support for PL2303HXN
    - USB: serial: pl2303: fix line-speed handling on newer chips

  * Focal update: v5.4.151 upstream stable release (LP: #1947888)
    - tty: Fix out-of-bound vmalloc access in imageblit
    - cpufreq: schedutil: Use kobject release() method to free sugov_tunables
    - cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
    - usb: cdns3: fix race condition before setting doorbell
    - fs-verity: fix signed integer overflow with i_size near S64_MAX
    - hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field
    - hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field
    - hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field
    - scsi: ufs: Fix illegal offset in UPIU event trace
    - mac80211: fix use-after-free in CCMP/GCMP RX
    - x86/kvmclock: Move this_cpu_pvti into kvmclock.h
    - drm/amd/display: Pass PCI deviceid into DC
    - ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
    - hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
    - mac80211: Fix ieee80211_amsdu_aggregate frag_tail bug
    - mac80211: limit injected vht mcs/nss in ieee80211_parse_tx_radiotap
    - mac80211: mesh: fix potentially unaligned access
    - mac80211-hwsim: fix late beacon hrtimer handling
    - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
    - hwmon: (tmp421) report /PVLD condition as fault
    - hwmon: (tmp421) fix rounding for negative values
    - net: ipv4: Fix rtnexthop len when RTA_FLOW is present
    - e100: fix length calculation in e100_get_regs_len
    - e100: fix buffer overrun in e100_get_regs
    - selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
    - scsi: csiostor: Add module softdep on cxgb4
    - net: hns3: do not allow call hns3_nic_net_open repeatedly
    - net: sched: flower: protect fl_walk() with rcu
    - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
    - perf/x86/intel: Update event constraints for ICX
    - elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings
    - debugfs: debugfs_create_file_size(): use IS_ERR to check for error
    - ipack: ipoctal: fix stack information leak
    - ipack: ipoctal: fix tty registration race
    - ipack: ipoctal: fix tty-registration error handling
    - ipack: ipoctal: fix missing allocation-failure check
    - ipack: ipoctal: fix module reference leak
    - ext4: fix loff_t overflow in ext4_max_bitmap_size()
    - ext4: fix reserved space counter leakage
    - ext4: fix potential infinite loop in ext4_dx_readdir()
    - HID: u2fzero: ignore incomplete packets without data
    - net: udp: annotate data race around udp_sk(sk)->corkflag
    - net: stmmac: don't attach interface until resume finishes
    - PCI: Fix pci_host_bridge struct device release/free handling
    - libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind
    - hso: fix bailout in error case of probe
    - usb: hso: fix error handling code of hso_create_net_device
    - usb: hso: remove the bailout parameter
    - crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
    - HID: betop: fix slab-out-of-bounds Write in betop_probe
    - netfilter: ipset: Fix oversized kvmalloc() calls
    - HID: usbhid: free raw_report buffers in usbhid_stop
    - Linux 5.4.151

  * Focal update: v5.4.150 upstream stable release (LP: #1947886)
    - usb: gadget: r8a66597: fix a loop in set_feature()
    - usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
    - usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
    - usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
    - cifs: fix incorrect check for null pointer in header_assemble
    - xen/x86: fix PV trap handling on secondary processors
    - usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
    - USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
    - USB: cdc-acm: fix minor-number release
    - binder: make sure fd closes complete
    - staging: greybus: uart: fix tty use after free
    - Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
    - USB: serial: mos7840: remove duplicated 0xac24 device ID
    - USB: serial: option: add Telit LN920 compositions
    - USB: serial: option: remove duplicate USB device ID
    - USB: serial: option: add device id for Foxconn T99W265
    - mcb: fix error handling in mcb_alloc_bus()
    - erofs: fix up erofs_lookup tracepoint
    - btrfs: prevent __btrfs_dump_space_info() to underflow its free space
    - serial: mvebu-uart: fix driver's tx_empty callback
    - net: hso: fix muxed tty registration
    - afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation
    - platform/x86/intel: punit_ipc: Drop wrong use of ACPI_PTR()
    - enetc: Fix illegal access when reading affinity_hint
    - bnxt_en: Fix TX timeout when TX ring size is set to the smallest
    - net/smc: add missing error check in smc_clc_prfx_set()
    - gpio: uniphier: Fix void functions to remove return value
    - qed: rdma - don't wait for resources under hw error recovery flow
    - net/mlx4_en: Don't allow aRFS for encapsulated packets
    - scsi: iscsi: Adjust iface sysfs attr detection
    - tty: synclink_gt, drop unneeded forward declarations
    - tty: synclink_gt: rename a conflicting function name
    - fpga: machxo2-spi: Return an error on failure
    - fpga: machxo2-spi: Fix missing error code in machxo2_write_complete()
    - thermal/core: Potential buffer overflow in thermal_build_list_of_policies()
    - cifs: fix a sign extension bug
    - scsi: qla2xxx: Restore initiator in dual mode
    - scsi: lpfc: Use correct scnprintf() limit
    - irqchip/goldfish-pic: Select GENERIC_IRQ_CHIP to fix build
    - irqchip/gic-v3-its: Fix potential VPE leak on error
    - md: fix a lock order reversal in md_alloc
    - blktrace: Fix uaf in blk_trace access after removing by sysfs
    - net: macb: fix use after free on rmmod
    - net: stmmac: allow CSR clock of 300MHz
    - m68k: Double cast io functions to unsigned long
    - ipv6: delay fib6_sernum increase in fib6_add
    - bpf: Add oversize check before call kvcalloc()
    - xen/balloon: use a kernel thread instead a workqueue
    - nvme-multipath: fix ANA state updates when a namespace is not present
    - sparc32: page align size in arch_dma_alloc
    - blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd
    - compiler.h: Introduce absolute_pointer macro
    - net: i825xx: Use absolute_pointer for memcpy from fixed memory location
    - sparc: avoid stringop-overread errors
    - qnx4: avoid stringop-overread errors
    - parisc: Use absolute_pointer() to define PAGE0
    - arm64: Mark __stack_chk_guard as __ro_after_init
    - alpha: Declare virt_to_phys and virt_to_bus parameter as pointer to volatile
    - net: 6pack: Fix tx timeout and slot time
    - spi: Fix tegra20 build with CONFIG_PM=n
    - EDAC/synopsys: Fix wrong value type assignment for edac_mode
    - thermal/drivers/int340x: Do not set a wrong tcc offset on resume
    - arm64: dts: marvell: armada-37xx: Extend PCIe MEM space
    - xen/balloon: fix balloon kthread freezing
    - qnx4: work around gcc false positive warning bug
    - Linux 5.4.150

  * ACL updates on OCFS2 are not revalidated (LP: #1947161) // Focal update: v5.4.150 upstream stable release (LP: #1947886)
    - ocfs2: drop acl cache for directories too

  * Focal update: v5.4.149 upstream stable release (LP: #1947885)
    - PCI: pci-bridge-emul: Fix big-endian support
    - PCI: aardvark: Indicate error in 'val' when config read fails
    - PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
    - PCI: aardvark: Fix reporting CRS value
    - PCI/ACPI: Add Ampere Altra SOC MCFG quirk
    - KVM: remember position in kvm->vcpus array
    - console: consume APC, DM, DCS
    - s390/pci_mmio: fully validate the VMA before calling follow_pte()
    - ARM: Qualify enabling of swiotlb_init()
    - apparmor: remove duplicate macro list_entry_is_head()
    - ARM: 9077/1: PLT: Move struct plt_entries definition to header
    - ARM: 9078/1: Add warn suppress parameter to arm_gen_branch_link()
    - ARM: 9079/1: ftrace: Add MODULE_PLTS support
    - ARM: 9098/1: ftrace: MODULE_PLT: Fix build problem without DYNAMIC_FTRACE
    - sctp: validate chunk size in __rcv_asconf_lookup
    - sctp: add param size validation for SCTP_PARAM_SET_PRIMARY
    - staging: rtl8192u: Fix bitwise vs logical operator in TranslateRxSignalStuff819xUsb()
    - um: virtio_uml: fix memory leak on init failures
    - dmaengine: acpi: Avoid comparison GSI with Linux vIRQ
    - thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
    - 9p/trans_virtio: Remove sysfs file on probe failure
    - prctl: allow to setup brk for et_dyn executables
    - nilfs2: use refcount_dec_and_lock() to fix potential UAF
    - profiling: fix shift-out-of-bounds bugs
    - pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered
    - phy: avoid unnecessary link-up delay in polling mode
    - net: stmmac: reset Tx desc base address before restarting Tx
    - Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
    - thermal/core: Fix thermal_cooling_device_register() prototype
    - drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
    - parisc: Move pci_dev_is_behind_card_dino to where it is used
    - dmaengine: sprd: Add missing MODULE_DEVICE_TABLE
    - dmaengine: ioat: depends on !UML
    - dmaengine: xilinx_dma: Set DMA mask for coherent APIs
    - ceph: request Fw caps before updating the mtime in ceph_write_iter
    - ceph: lockdep annotations for try_nonblocking_invalidate
    - btrfs: fix lockdep warning while mounting sprout fs
    - nilfs2: fix memory leak in nilfs_sysfs_create_device_group
    - nilfs2: fix NULL pointer in nilfs_##name##_attr_release
    - nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
    - nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
    - nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
    - nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
    - pwm: img: Don't modify HW state in .remove() callback
    - pwm: rockchip: Don't modify HW state in .remove() callback
    - pwm: stm32-lp: Don't modify HW state in .remove() callback
    - blk-throttle: fix UAF by deleteing timer in blk_throtl_exit()
    - rtc: rx8010: select REGMAP_I2C
    - drm/nouveau/nvkm: Replace -ENOSYS with -ENODEV
    - Linux 5.4.149

 -- Kleber Sacilotto de Souza <kleber.so...@canonical.com>  Fri, 05 Nov 2021 17:02:56 +0100

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1948862

Title:
  KVM emulation failure when booting into VM crash kernel with multiple
  CPUs

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Focal:
  Fix Released

Bug description:
  [Impact]
  When kexec'ing into a crash kernel with ncpus > 1, VMs can raise a KVM
  emulation failure. This will cause the VM to go into the "paused"
  state, and prevents it from being restored without a full VM restart.
  This happens only when there are multiple enabled CPUs in the crash
  kernel command-line, regardless of whether `nr_cpus` or `maxcpus` is
  being used. Due to the vCPU MMU state not being cleaned up correctly,
  the secondary CPUs try to access virtual addresses with a faulty MMU
  context that will result in the emulation failure. This shows up with
  a similar spew as below:

  $ sudo tail -n20 /var/log/libvirt/qemu/focal-vm.log
  KVM internal error. Suberror: 1
  emulation failure
  EAX=0000de8f EBX=00000000 ECX=0000008f EDX=00000600
  ESI=00000000 EDI=00000000 EBP=00000000 ESP=0000f90c
  EIP=0000cdb1 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0000 00000000 0000ffff 00009300
  CS =f000 000f0000 0000ffff 00009b00
  SS =de00 000de000 0000ffff 00009300
  DS =de00 000de000 0000ffff 00009300
  FS =0000 00000000 0000ffff 00009300
  GS =0000 00000000 0000ffff 00009300
  LDT=0000 00000000 0000ffff 00008200
  TR =0000 00000000 0000ffff 00008b00
  GDT=     00000000 0000ffff
  IDT=     00000000 0000ffff
  CR0=60000010 CR2=00000000 CR3=290b8001 CR4=00000000
  DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
  DR6=00000000ffff0ff0 DR7=0000000000000400
  EFER=0000000000000000
  Code=66 83 c4 28 66 5b 66 c3 66 56 66 53 66 52 b1 8f 88 c8 e6 70 <e4> 71 66 0f b6 f0 66 89 f2 67 88 54 24 03 88 c8 e6 70 66 31 db 88 d8 e6 71 66 56 66 68 1a

  [Test Plan]
  1. Boot an Ubuntu guest VM with e.g. multipass:
  $ multipass launch daily:focal -c8 -m16g -n focal-vm

  2. Configure guest crash kernel command-line with `nr_cpus=8`:
  ubuntu@focal-vm:~$ grep CMDLINE_APPEND /etc/default/kdump-tools
  # KDUMP_CMDLINE_APPEND - Additional arguments to append to the command line
  KDUMP_CMDLINE_APPEND="reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=8 irqpoll nousb ata_piix.prefer_ms_hyperv=0"

  3. Crash guest VM and watch for the KVM emulation failure:
  ubuntu@focal-vm:~$ echo c | sudo tee /proc/sysrq-trigger

  [Where problems could occur]
  As we're resetting MMU context on vCPUs, potential regressions would
  show up in workloads relying on KVM guests. We should properly test
  the scenario mentioned in the bug to make sure secondary CPUs are
  being cleaned up properly, and that no other regressions have been
  introduced when rebooting or kexec'ing into different kernels. Since
  we're adding an MMU reset at kvm_vcpu_reset(), the overall regression
  potential should be fairly low and contained to starting/resetting
  vCPUs (i.e. VM start and reboot).

  [Other info]
  This has been fixed by upstream commit:
  0aa1837533e5 KVM: x86: Properly reset MMU context at vCPU RESET/INIT

  The commit above has been picked up by stable trees up until 5.11, so
  it's only needed in Bionic and Focal (4.15 and 5.4 kernels). There are
  also two follow up commits, which revert the vendor-specific resets:
  5d2d7e41e3b8 KVM: SVM: Drop explicit MMU reset at RESET/INIT
  61152cd907d5 KVM: VMX: Remove explicit MMU reset in enter_rmode()

  These follow ups have not been picked up in stable trees due to the
  risk of regressions. According to the original fix, they have been
  introduced primarily to aid bisection in case there are workflows
  relying on the vendor resets. As these are not required for the fix
  and don't conflict with the backport, we should leave them out to
  prevent potential regressions in the older kernels.
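
For anyone repeating the test plan above, the two manual checks (reading the CPU count out of the crash kernel command line, and spotting the failure in the libvirt log) could be scripted roughly as follows. This is only a sketch and not part of the fix: the function names are made up for illustration, and the failure check assumes the log contains the same "KVM internal error" and "emulation failure" strings as the spew quoted in the bug description.

```python
import re

# Strings taken from the log excerpt quoted in the bug description.
# Assumption: other QEMU/libvirt versions print the same wording.
KVM_ERROR_MARKER = "KVM internal error"
EMULATION_MARKER = "emulation failure"


def has_kvm_emulation_failure(log_text: str) -> bool:
    """Return True if both failure markers appear in the given log text."""
    return KVM_ERROR_MARKER in log_text and EMULATION_MARKER in log_text


def crash_kernel_cpu_limit(cmdline: str):
    """Return the first nr_cpus= or maxcpus= value on a command line, or None.

    The bug is reported to trigger with either parameter once more than
    one CPU is enabled in the crash kernel.
    """
    match = re.search(r"(?:^|\s)(?:nr_cpus|maxcpus)=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Command line from step 2 of the test plan.
    append = ("reset_devices systemd.unit=kdump-tools-dump.service "
              "nr_cpus=8 irqpoll nousb ata_piix.prefer_ms_hyperv=0")
    print("crash kernel CPU limit:", crash_kernel_cpu_limit(append))
```

On the host, `has_kvm_emulation_failure()` would be fed the contents of the guest's log, e.g. /var/log/libvirt/qemu/focal-vm.log from the example above, after step 3 has crashed the guest.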