This bug was fixed in the package linux - 4.15.0-23.25
---------------
linux (4.15.0-23.25) bionic; urgency=medium
* linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)
* arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
- arm64: mmu: add the entry trampolines start/end section markers into
sections.h
- arm64: sdei: Add trampoline code for remapping the kernel
* Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
- ACPI: APEI: handle PCIe AER errors in separate function
- ACPI: APEI: call into AER handling regardless of severity
* qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
- scsi: qla2xxx: Fix session cleanup for N2N
- scsi: qla2xxx: Remove unused argument from
qlt_schedule_sess_for_deletion()
- scsi: qla2xxx: Serialize session deletion by using work_lock
- scsi: qla2xxx: Serialize session free in qlt_free_session_done
- scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
- scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
- scsi: qla2xxx: Prevent relogin trigger from sending too many commands
- scsi: qla2xxx: Fix double free bug after firmware timeout
- scsi: qla2xxx: Fixup locking for session deletion
* Several hisi_sas bug fixes (LP: #1768974)
- scsi: hisi_sas: dt-bindings: add an property of signal attenuation
- scsi: hisi_sas: support the property of signal attenuation for v2 hw
- scsi: hisi_sas: fix the issue of link rate inconsistency
- scsi: hisi_sas: fix the issue of setting linkrate register
- scsi: hisi_sas: increase timer expire of internal abort task
- scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
- scsi: hisi_sas: fix return value of hisi_sas_task_prep()
- scsi: hisi_sas: Code cleanup and minor bug fixes
* [bionic] machine stuck and bonding not working well when nvmet_rdma module
is loaded (LP: #1764982)
- nvmet-rdma: Don't flush system_wq by default during remove_one
- nvme-rdma: Don't flush delete_wq by default during remove_one
* Warnings/hang during error handling of SATA disks on SAS controller
(LP: #1768971)
- scsi: libsas: defer ata device eh commands to libata
* Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
- ata: do not schedule hot plug if it is a sas host
* ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU
ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
- powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
- powerpc/64s: return more carefully from sreset NMI
- powerpc/64s: sreset panic if there is no debugger or crash dump handlers
* fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
- fsnotify: Fix fsnotify_mark_connector race
* Hang on network interface removal in Xen virtual machine (LP: #1771620)
- xen-netfront: Fix hang on device removal
* HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
- net: hns: Avoid action name truncation
* Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
- SAUCE: powerpc/perf: Fix memory allocation for core-imc based on
num_possible_cpus()
* Switch Build-Depends: transfig to fig2dev (LP: #1770770)
- [Config] update Build-Depends: transfig to fig2dev
* smp_call_function_single/many core hangs with stop4 alone (LP: #1768898)
- cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer
interrupt
* Add d-i support for Huawei NICs (LP: #1767490)
- d-i: add hinic to nic-modules udeb
* unregister_netdevice: waiting for eth0 to become free. Usage count = 5
(LP: #1746474)
- xfrm: reuse uncached_list to track xdsts
* Include nfp driver in linux-modules (LP: #1768526)
- [Config] Add nfp.ko to generic inclusion list
* Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
- x86/xen: Reset VCPU0 info pointer after shared_info remap
* CVE-2018-3639 (x86)
- x86/bugs: Fix the parameters alignment and missing void
- KVM: SVM: Move spec control call after restore of GS
- x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
- x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
- x86/cpufeatures: Disentangle SSBD enumeration
- x86/cpufeatures: Add FEATURE_ZEN
- x86/speculation: Handle HT correctly on AMD
- x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
- x86/speculation: Add virtualized speculative store bypass disable support
- x86/speculation: Rework speculative_store_bypass_update()
- x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
- x86/bugs: Expose x86_spec_ctrl_base directly
- x86/bugs: Remove x86_spec_ctrl_set()
- x86/bugs: Rework spec_ctrl base and mask logic
- x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
- KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
- x86/bugs: Rename SSBD_NO to SSB_NO
- bpf: Prevent memory disambiguation attack
- KVM: VMX: Expose SSBD properly to guests.
* Suspend to idle: Open lid didn't resume (LP: #1771542)
- ACPI / PM: Do not reconfigure GPEs for suspend-to-idle
* Fix initialization failure detection in SDEI for device-tree based systems
(LP: #1768663)
- firmware: arm_sdei: Fix return value check in sdei_present_dt()
* No driver for Huawei network adapters on arm64 (LP: #1769899)
- net-next/hinic: add arm64 support
* CVE-2018-1092
- ext4: fail ext4_iget for root directory if unallocated
* kernel 4.15 breaks nouveau on Lenovo P50 (LP: #1763189)
- drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
* update-initramfs not adding i915 GuC firmware for Kaby Lake, firmware fails
to load (LP: #1728238)
- Revert "UBUNTU: SAUCE: (no-up) i915: Remove MODULE_FIRMWARE statements for
unreleased firmware"
* Battery drains when laptop is off (shutdown) (LP: #1745646)
- PCI / PM: Check device_may_wakeup() in pci_enable_wake()
* Dell Latitude 5490/5590 BIOS update 1.1.9 causes black screen at boot
(LP: #1764194)
- drm/i915/bios: filter out invalid DDC pins from VBT child devices
* Intel 9462 A370:42A4 doesn't work (LP: #1748853)
- iwlwifi: add shared clock PHY config flag for some devices
- iwlwifi: add a bunch of new 9000 PCI IDs
* Fix an issue that some PCI devices get incorrectly suspended (LP: #1764684)
- PCI / PM: Always check PME wakeup capability for runtime wakeup support
* [SRU][Bionic/Artful] fix false positives in W+X checking (LP: #1769696)
- init: fix false positives in W+X checking
* Bionic update to v4.15.18 stable release (LP: #1769723)
- netfilter: ipset: Missing nfnl_lock()/nfnl_unlock() is added to
ip_set_net_exit()
- cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
- rds: MP-RDS may use an invalid c_path
- slip: Check if rstate is initialized before uncompressing
- vhost: fix vhost_vq_access_ok() log check
- l2tp: fix races in tunnel creation
- l2tp: fix race in duplicate tunnel detection
- ip_gre: clear feature flags when incompatible o_flags are set
- vhost: Fix vhost_copy_to_user()
- lan78xx: Correctly indicate invalid OTP
- media: v4l2-compat-ioctl32: don't oops on overlay
- media: v4l: vsp1: Fix header display list status check in continuous mode
- ipmi: Fix some error cleanup issues
- parisc: Fix out of array access in match_pci_device()
- parisc: Fix HPMC handler by increasing size to multiple of 16 bytes
- Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
- PCI: hv: Serialize the present and eject work items
- PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
- KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
- perf/core: Fix use-after-free in uprobe_perf_close()
- x86/mce/AMD: Get address from already initialized block
- hwmon: (ina2xx) Fix access to uninitialized mutex
- ath9k: Protect queue draining by rcu_read_lock()
- x86/apic: Fix signedness bug in APIC ID validity checks
- f2fs: fix heap mode to reset it back
- block: Change a rcu_read_{lock,unlock}_sched() pair into
rcu_read_{lock,unlock}()
- nvme: Skip checking heads without namespaces
- lib: fix stall in __bitmap_parselist()
- blk-mq: order getting budget and driver tag
- blk-mq: don't keep offline CPUs mapped to hctx 0
- ovl: fix lookup with middle layer opaque dir and absolute path redirects
- xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
- hugetlbfs: fix bug in pgoff overflow checking
- nfsd: fix incorrect umasks
- scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
- block/loop: fix deadlock after loop_set_status
- nfit: fix region registration vs block-data-window ranges
- s390/qdio: don't retry EQBS after CCQ 96
- s390/qdio: don't merge ERROR output buffers
- s390/ipl: ensure loadparm valid flag is set
- get_user_pages_fast(): return -EFAULT on access_ok failure
- mm/gup_benchmark: handle gup failures
- getname_kernel() needs to make sure that ->name != ->iname in long case
- Bluetooth: Fix connection if directed advertising and privacy is used
- Bluetooth: hci_bcm: Treat Interrupt ACPI resources as always being
active-low
- rtl8187: Fix NULL pointer dereference in priv->conf_mutex
- ovl: set lower layer st_dev only if setting lower st_ino
- Linux 4.15.18
* Kernel bug when unplugging Thunderbolt 3 cable, leaves xHCI host controller
dead (LP: #1768852)
- xhci: Fix Kernel oops in xhci dbgtty
* Incorrect blacklist of bcm2835_wdt (LP: #1766052)
- [Packaging] Fix missing watchdog for Raspberry Pi
* CVE-2018-8087
- mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()
* Integrated Webcam Realtek Integrated_Webcam_HD (0bda:58f4) not working in
DELL XPS 13 9370 with firmware 1.50 (LP: #1763748)
- SAUCE: media: uvcvideo: Support realtek's UVC 1.5 device
* [ALSA] [PATCH] Clevo P950ER ALC1220 Fixup (LP: #1769721)
- SAUCE: ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup
* Bionic: Intermittently sent to Emergency Mode on boot with unhandled kernel
NULL pointer dereference at 0000000000000980 (LP: #1768292)
- thunderbolt: Prevent crash when ICM firmware is not running
* linux-snapdragon: reduce EPROBEDEFER noise during boot (LP: #1768761)
- [Config] snapdragon: DRM_I2C_ADV7511=y
* regression Aquantia Corp. AQC107 4.15.0-13-generic -> 4.15.0-20-generic ?
(LP: #1767088)
- net: aquantia: Regression on reset with 1.x firmware
- net: aquantia: oops when shutdown on already stopped device
* e1000e msix interrupts broken in linux-image-4.15.0-15-generic
(LP: #1764892)
- e1000e: Remove Other from EIAC
* Acer Swift sf314-52 power button not managed (LP: #1766054)
- SAUCE: platform/x86: acer-wmi: add another KEY_POWER keycode
* set PINCFG_HEADSET_MIC to parse_flags for Dell precision 3630 (LP: #1766398)
- ALSA: hda/realtek - set PINCFG_HEADSET_MIC to parse_flags
* Change the location for one of two front mics on a lenovo thinkcentre
machine (LP: #1766477)
- ALSA: hda/realtek - adjust the location of one mic
* SRU: bionic: apply 50 ZFS upstream bugfixes (LP: #1764690)
- SAUCE: (noup) Update zfs to 0.7.5-1ubuntu15 (LP: #1764690)
* [8086:3e92] display becomes blank after S3 (LP: #1763271)
- drm/i915/edp: Do not do link training fallback or prune modes on EDP
-- Stefan Bader <[email protected]> Wed, 23 May 2018 18:54:55 +0200
** Changed in: linux (Ubuntu Bionic)
Status: Fix Committed => Fix Released
** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-1092
** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-3639
** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-8087
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1764982
Title:
[bionic] machine stuck and bonding not working well when nvmet_rdma
module is loaded
Status in linux package in Ubuntu:
In Progress
Status in linux source package in Bionic:
Fix Released
Bug description:
== SRU Justification ==
This bug causes the machine to get stuck and bonding to not work when
the nvmet_rdma module is loaded.
Both of these commits are in mainline as of v4.17-rc1.
== Fixes ==
a3dd7d0022c3 ("nvmet-rdma: Don't flush system_wq by default during
remove_one")
9bad0404ecd7 ("nvme-rdma: Don't flush delete_wq by default during remove_one")
== Regression Potential ==
Low. Limited to nvme driver and tested by Mellanox.
== Test Case ==
A test kernel was built with these patches and tested by the original bug
reporter.
The bug reporter states the test kernel resolved the bug.
== Original Bug Description ==
Hi
The machine gets stuck after unregistering the bonding interface while the
nvmet_rdma module is loaded.
scenario:
# modprobe nvmet_rdma
# modprobe -r bonding
# modprobe bonding -v mode=1 miimon=100 fail_over_mac=0
# ifdown eth4
# ifdown eth5
# ip addr add 15.209.12.173/8 dev bond0
# ip link set bond0 up
# echo +eth5 > /sys/class/net/bond0/bonding/slaves
# echo +eth4 > /sys/class/net/bond0/bonding/slaves
# echo -eth4 > /sys/class/net/bond0/bonding/slaves
# echo -eth5 > /sys/class/net/bond0/bonding/slaves
# echo -bond0 > /sys/class/net/bonding_masters
dmesg:
kernel: [78348.225556] bond0 (unregistering): Released all slaves
kernel: [78358.339631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78368.419621] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78378.499615] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78388.579625] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78398.659613] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78408.739655] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78418.819634] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78428.899642] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78438.979614] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78449.059619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78459.139626] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78469.219623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78479.299619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78489.379620] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78499.459623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78509.539631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
kernel: [78519.619629] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
The following upstream commits fix this issue:
commit a3dd7d0022c347207ae931c753a6dc3e6e8fcbc1
Author: Max Gurtovoy <[email protected]>
Date: Wed Feb 28 13:12:38 2018 +0200
nvmet-rdma: Don't flush system_wq by default during remove_one
The .remove_one function is called for any ib_device removal.
In case the removed device has no reference in our driver, there
is no need to flush the system work queue.
Reviewed-by: Israel Rukshin <[email protected]>
Signed-off-by: Max Gurtovoy <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index aa8068f..a59263d 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1469,8 +1469,25 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
 static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvmet_rdma_queue *queue, *tmp;
+	struct nvmet_rdma_device *ndev;
+	bool found = false;
+
+	mutex_lock(&device_list_mutex);
+	list_for_each_entry(ndev, &device_list, entry) {
+		if (ndev->device == ib_device) {
+			found = true;
+			break;
+		}
+	}
+	mutex_unlock(&device_list_mutex);
+
+	if (!found)
+		return;
 
-	/* Device is being removed, delete all queues using this device */
+	/*
+	 * IB Device that is used by nvmet controllers is being removed,
+	 * delete all queues using this device.
+	 */
 	mutex_lock(&nvmet_rdma_queue_mutex);
 	list_for_each_entry_safe(queue, tmp, &nvmet_rdma_queue_list,
 				 queue_list) {
commit 9bad0404ecd7594265cef04e176adeaa4ffbca4a
Author: Max Gurtovoy <[email protected]>
Date: Wed Feb 28 13:12:39 2018 +0200
nvme-rdma: Don't flush delete_wq by default during remove_one
The .remove_one function is called for any ib_device removal.
In case the removed device has no reference in our driver, there
is no need to flush the work queue.
Reviewed-by: Israel Rukshin <[email protected]>
Signed-off-by: Max Gurtovoy <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f5f460b..250b277 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2024,6 +2024,20 @@ static struct nvmf_transport_ops nvme_rdma_transport = {
 static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvme_rdma_ctrl *ctrl;
+	struct nvme_rdma_device *ndev;
+	bool found = false;
+
+	mutex_lock(&device_list_mutex);
+	list_for_each_entry(ndev, &device_list, entry) {
+		if (ndev->dev == ib_device) {
+			found = true;
+			break;
+		}
+	}
+	mutex_unlock(&device_list_mutex);
+
+	if (!found)
+		return;
 
 	/* Delete all controllers using this device */
 	mutex_lock(&nvme_rdma_ctrl_mutex);
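Both patches apply the same guard: walk the driver's own device list first, and return early when the device being removed was never used by the driver, so the expensive flush is skipped. The following is only a rough user-space sketch of that pattern, not the kernel code. The names fake_ib_device, tracked_device, track_device, expensive_flush and flush_count are hypothetical stand-ins introduced for illustration, and the locking that the kernel does with device_list_mutex is left out to keep the sketch single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ib_device. */
struct fake_ib_device {
	const char *name;
};

/* Hypothetical stand-in for the driver's per-device bookkeeping entry. */
struct tracked_device {
	const struct fake_ib_device *device;
	struct tracked_device *next;
};

/* Devices the driver actually holds a reference to. */
struct tracked_device *device_list;

/* Counts how often the costly flush path ran (illustration only). */
int flush_count;

/* Record that the driver uses this device. */
void track_device(struct tracked_device *entry,
		  const struct fake_ib_device *dev)
{
	assert(entry != NULL);
	entry->device = dev;
	entry->next = device_list;
	device_list = entry;
}

/* Stand-in for the queue teardown and work queue flush. */
void expensive_flush(void)
{
	flush_count++;
}

/*
 * Called for every device removal, like the .remove_one callback.
 * Returns true only if the device was tracked and the flush ran.
 * The kernel code holds device_list_mutex around the list walk;
 * that lock is omitted here.
 */
bool remove_one(const struct fake_ib_device *dev)
{
	bool found = false;
	struct tracked_device *it;

	for (it = device_list; it != NULL; it = it->next) {
		if (it->device == dev) {
			found = true;
			break;
		}
	}

	if (!found)
		return false;	/* not ours: nothing to flush */

	expensive_flush();
	return true;
}
```

In this sketch, calling remove_one() for a device that was never passed to track_device() leaves flush_count untouched, which mirrors the intent of the patches: removal of an unrelated device no longer triggers a work queue flush.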
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1764982/+subscriptions