For reference, the kernel spew of the BUG_ON:

[ 78.354129] kernel BUG at /home/ubuntu/xenial-aws/drivers/nvme/host/pci.c:619!
[ 78.357297] invalid opcode: 0000 [#1] SMP
[ 78.359613] Modules linked in: dm_snapshot dm_bufio xfs ppdev serio_raw parport_pc 8250_fintek parport i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ena
[ 78.387878] CPU: 0 PID: 1687 Comm: mount Not tainted 4.4.0-1105-aws #116
[ 78.390837] Hardware name: Amazon EC2 c5d.large/, BIOS 1.0 10/16/2017
[ 78.393692] task: ffff8800bb155400 ti: ffff8800b93bc000 task.ti: ffff8800b93bc000
[ 78.396973] RIP: 0010:[<ffffffff815dbd06>] [<ffffffff815dbd06>] nvme_queue_rq+0x8c6/0xa60
[ 78.400787] RSP: 0018:ffff8800b93bf7c8 EFLAGS: 00010286
[ 78.403151] RAX: 0000000000000078 RBX: 0000000000001000 RCX: 0000000000001000
[ 78.406276] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
[ 78.409390] RBP: ffff8800b93bf8a8 R08: ffff8800b916c700 R09: 0000000000001000
[ 78.412518] R10: 000000000001ec00 R11: ffff8800b8e30000 R12: 00000000fffffc00
[ 78.417056] R13: 0000000000000010 R14: 000000000000fc00 R15: 0000000035fd5000
[ 78.421581] FS: 00007f30fe043840(0000) GS:ffff880130a00000(0000) knlGS:0000000000000000
[ 78.427884] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.431827] CR2: 00007f57d4057889 CR3: 0000000035974000 CR4: 0000000000360670
[ 78.436322] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 78.440821] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 78.445316] Stack:
[ 78.447706] ffff880036009480 ffff880036009700 ffff8800b7782800 0000000000000ff8
[ 78.454583] ffff8800b8e30420 ffff8800360a9400 ffff88000001fc00 ffff8800b7697b00
[ 78.461462] ffff880100001000 ffff8800b8e30000 ffff88003604c000 00000001ffc00400
[ 78.468332] Call Trace:
[ 78.470921] [<ffffffff813e6617>] blk_mq_make_request+0x407/0x550
[ 78.475001] [<ffffffff813d8f14>] generic_make_request+0x114/0x2d0
[ 78.479110] [<ffffffff813d0371>] ? bvec_alloc+0x91/0x100
[ 78.482936] [<ffffffff813d9146>] submit_bio+0x76/0x160
[ 78.486680] [<ffffffffc0347a14>] _xfs_buf_ioapply+0x2e4/0x4a0 [xfs]
[ 78.490866] [<ffffffff810b22e0>] ? wake_up_q+0x70/0x70
[ 78.494601] [<ffffffffc0349c94>] ? xfs_bwrite+0x24/0x60 [xfs]
[ 78.498583] [<ffffffffc034975d>] xfs_buf_submit_wait+0x5d/0x230 [xfs]
[ 78.502861] [<ffffffffc0349c94>] xfs_bwrite+0x24/0x60 [xfs]
[ 78.506785] [<ffffffffc037108f>] xlog_bwrite+0x7f/0x100 [xfs]
[ 78.510787] [<ffffffffc0371f34>] xlog_write_log_records+0x1a4/0x230 [xfs]
[ 78.515192] [<ffffffffc0372077>] xlog_clear_stale_blocks+0xb7/0x1b0 [xfs]
[ 78.519596] [<ffffffffc037198f>] ? xlog_bread+0x3f/0x50 [xfs]
[ 78.523588] [<ffffffffc03765eb>] xlog_find_tail+0x2db/0x3b0 [xfs]
[ 78.527705] [<ffffffffc03766ed>] xlog_recover+0x2d/0x160 [xfs]
[ 78.531720] [<ffffffffc036a11b>] xfs_log_mount+0xdb/0x2a0 [xfs]
[ 78.535767] [<ffffffffc03612e3>] xfs_mountfs+0x4f3/0x870 [xfs]
[ 78.539788] [<ffffffffc036216b>] ? xfs_mru_cache_create+0x12b/0x180 [xfs]
[ 78.544197] [<ffffffffc036463b>] xfs_fs_fill_super+0x3bb/0x4e0 [xfs]
[ 78.548400] [<ffffffff8121dc70>] mount_bdev+0x270/0x2c0
[ 78.552169] [<ffffffffc0364280>] ? xfs_parseargs+0xab0/0xab0 [xfs]
[ 78.556338] [<ffffffffc03628e5>] xfs_fs_mount+0x15/0x20 [xfs]
[ 78.560337] [<ffffffff8121e65d>] mount_fs+0x3d/0x170
[ 78.564091] [<ffffffff811bc405>] ? __alloc_percpu+0x15/0x20
[ 78.568060] [<ffffffff8123b257>] vfs_kern_mount+0x67/0x110
[ 78.571971] [<ffffffff8123d95f>] do_mount+0x25f/0xda0
[ 78.575692] [<ffffffff8123bad4>] ? mntput+0x24/0x40
[ 78.579334] [<ffffffff811fbf06>] ? __kmalloc_track_caller+0x1b6/0x250
[ 78.583595] [<ffffffff8121c483>] ? __fput+0x193/0x230
[ 78.587296] [<ffffffff811b6952>] ? memdup_user+0x42/0x70
[ 78.591111] [<ffffffff8123e7df>] SyS_mount+0x9f/0x100
[ 78.594804] [<ffffffff818449db>] entry_SYSCALL_64_fastpath+0x22/0xcb
[ 78.599029] Code: 11 e3 e3 ff 44 8b 95 50 ff ff ff 48 89 85 68 ff ff ff 4c 8b 48 10 44 8b 58 18 8b 95 58 ff ff ff 8b 8d 60 ff ff ff e9 0a fd ff ff <0f> 0b 48 8b 73 68 48 8b bd 70 ff ff ff e8 58 c5 e2 ff 83 f8 01
[ 78.625198] RIP [<ffffffff815dbd06>] nvme_queue_rq+0x8c6/0xa60
[ 78.629410] RSP <ffff8800b93bf7c8>
[ 78.632442] ---[ end trace de20412ccd13806e ]---

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1869229

Title:
  Mounting LVM snapshots with xfs can hit kernel BUG in nvme driver

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Confirmed

Bug description:
  [Impact]
  When mounting LVM snapshots using xfs, it's possible to hit a BUG_ON()
  in nvme driver. Upstream commit 729204ef49ec ("block: relax check on sg
  gap") introduced a way to merge bios if they are physically contiguous.
  This can lead to issues if one rq starts with a non-aligned buffer, as
  it can cause the merged segment to end in an unaligned virtual boundary.
  In some AWS instances, it's possible to craft such a request when
  attempting to mount LVM snapshots using xfs. This will then cause a
  kernel spew due to a BUG_ON in nvme_setup_prps(), which checks if
  dma_len is aligned to the page size.

  [Fix]
  Upstream commit 5a8d75a1b8c9 ("block: fix bio_will_gap() for first bvec
  with offset") prevents requests that begin with an unaligned buffer
  from being merged.

  [Test Case]
  This has been verified on AWS with c5d.large instances:

  1) Prepare the LVM device + snapshot
  $ sudo vgcreate vg0 /dev/nvme1n1
  $ sudo lvcreate -L5G -n data0 vg0
  $ sudo mkfs.xfs /dev/vg0/data0
  $ sudo mount /dev/vg0/data0 /mnt
  $ sudo touch /mnt/test
  $ sudo touch /mnt/test2
  $ sudo ls /mnt
  $ sudo umount /mnt
  $ sudo lvcreate -l100%FREE -s /dev/vg0/data0 -n data0_snap

  2) Attempting to mount the previously created snapshot results in the Oops:
  $ sudo mount /dev/vg0/data0_snap /mnt
  Segmentation fault (core dumped)

  [Regression Potential]
  The fix prevents some bios from being merged, so it can have a
  performance impact in certain scenarios. The patch only targets
  misaligned segments, so the impact should be less noticeable in the
  general case.
  The commit is also present in mainline kernels since 4.13, and hasn't
  been changed significantly, so potential for other regressions should
  be low.
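
To make the [Impact] and [Fix] descriptions more concrete, here is a minimal Python sketch of the alignment problem. It is a simplified model for illustration only, not the kernel code: the names walk_prps, trips_check and may_merge are hypothetical, the 4096-byte PAGE_SIZE is an assumption, and may_merge assumes the alignment boundary equals the page size. It walks a list of (dma_addr, dma_len) segments one device page at a time and flags the case where a segment ends inside a page while data is still pending, which is roughly the situation the BUG_ON is described as guarding against.

```python
PAGE_SIZE = 4096  # assumed device page size for this sketch


def walk_prps(segments, page_size=PAGE_SIZE):
    """Return one address per device page for (dma_addr, dma_len) segments.

    Raises ValueError when a segment runs out part-way through a page
    while more data is still pending (simplified model, not kernel code).
    """
    remaining = sum(dma_len for _, dma_len in segments)
    prps = []
    first_entry = True
    for dma_addr, dma_len in segments:
        while dma_len > 0 and remaining > 0:
            prps.append(dma_addr)
            # Only the very first entry may start mid-page; it covers the
            # rest of that page. Every later entry covers a whole page.
            if first_entry:
                step = page_size - (dma_addr % page_size)
            else:
                step = page_size
            first_entry = False
            dma_addr += step
            dma_len -= step
            remaining -= step
        if remaining > 0 and dma_len < 0:
            raise ValueError("segment ends inside a page with data pending")
    return prps


def trips_check(segments, page_size=PAGE_SIZE):
    """True if the walk above would hit the modelled check."""
    try:
        walk_prps(segments, page_size)
    except ValueError:
        return True
    return False


def may_merge(first_bvec_offset, page_size=PAGE_SIZE):
    """Hypothetical stand-in for the rule in [Fix]: a request whose first
    buffer starts unaligned is not merged (boundary assumed = page size)."""
    return first_bvec_offset % page_size == 0


if __name__ == "__main__":
    aligned = [(0x1000, 0x1000), (0x5000, 0x1000)]
    # First segment starts at offset 0x200 and, after merging, is 0x1000
    # long, so it ends at 0x2200, inside a page, with another segment left.
    merged_unaligned = [(0x1200, 0x1000), (0x5000, 0x1000)]
    print(trips_check(aligned))           # False
    print(trips_check(merged_unaligned))  # True
    print(may_merge(0x200))               # False
```

In this model, a first segment that starts mid-page is fine as long as it ends on a page boundary, e.g. [(0x1200, 0xe00), (0x5000, 0x1000)]. The trouble only appears once merging lets that first segment grow so that its end is no longer page aligned, which is why refusing to merge requests that begin with an unaligned buffer avoids the problem.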

