This bug is awaiting verification that the linux-azure/5.15.0-1006.7 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-jammy

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1973169

Title:
  [Azure][CVM] Fix swiotlb_max_mapping_size() for potential bounce buffer allocation failure in storvsc

Status in linux-azure package in Ubuntu:
  New
Status in linux-azure source package in Jammy:
  In Progress

Bug description:

SRU Justification

[Impact]

Description of problem:

When the v5.15 linux-azure kernel is used for CVM on Azure, it uses swiotlb for bounce buffering. We recently found an issue in swiotlb_max_mapping_size(), which is used by the SCSI subsystem APIs, which in turn are used by the hv_storvsc driver.

The issue is: currently swiotlb_max_mapping_size() always reports 256KB (i.e. 128 bounce buffer slots), but swiotlb_tbl_map_single() is unable to allocate a bounce buffer for an unaligned 256KB request, and eventually it can get stuck and we see this call trace (BTW, this call trace was obtained from a SLES VM, but I believe the issue exists in all distro kernels supporting CVM, and Tianyu says he's able to reproduce the issue in an Ubuntu CVM when trying to mount an XFS file system):

[ 186.458666][ C1] swiotlb_tbl_map_single+0x396/0x920
[ 186.458669][ C1] swiotlb_map+0xaa/0x2d0
[ 186.458674][ C1] dma_direct_map_sg+0xee/0x2c0
[ 186.458677][ C1] __dma_map_sg_attrs+0x30/0x70
[ 186.458680][ C1] dma_map_sg_attrs+0xa/0x20
[ 186.458681][ C1] scsi_dma_map+0x35/0x40
[ 186.458684][ C1] storvsc_queuecommand+0x20b/0x890
[ 186.458696][ C1] scsi_queue_rq+0x606/0xb80
[ 186.458698][ C1] __blk_mq_try_issue_directly+0x149/0x1c0
[ 186.458702][ C1] blk_mq_try_issue_directly+0x15/0x50
[ 186.458704][ C1] blk_mq_submit_bio+0x4b6/0x620
[ 186.458706][ C1] __submit_bio+0xe8/0x160
[ 186.458708][ C1] submit_bio_noacct_nocheck+0xf0/0x2b0
[ 186.458713][ C1] submit_bio+0x42/0xd0
[ 186.458714][ C1] submit_bio_wait+0x54/0xb0
[ 186.458718][ C1] xfs_rw_bdev+0x180/0x1b0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458769][ C1] xlog_do_io+0x8d/0x140 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458819][ C1] xlog_bread+0x1f/0x40 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458859][ C1] xlog_find_verify_cycle+0xc8/0x180 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458899][ C1] xlog_find_head+0x2ae/0x3a0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458937][ C1] xlog_find_tail+0x44/0x360 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458978][ C1] xlog_recover+0x2b/0x170 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459056][ C1] xfs_log_mount+0x15b/0x270 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459098][ C1] xfs_mountfs+0x49e/0x830 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459224][ C1] xfs_fs_fill_super+0x5c2/0x7c0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459303][ C1] get_tree_bdev+0x163/0x260
[ 186.459307][ C1] vfs_get_tree+0x25/0xc0
[ 186.459309][ C1] path_mount+0x704/0x9c0

Details:

For example, the original physical address from the SCSI layer can be 0x1_0903_f200 with size=256KB. When swiotlb_tbl_map_single() calls swiotlb_find_slots(), it passes "alloc_size + offset" (i.e. 256KB + 0x200) to swiotlb_find_slots(), which then calculates "nslots = nr_slots(alloc_size) ==> 129" and fails to allocate a bounce buffer, because the maximum allowable number of contiguous slabs to map is IO_TLB_SEGSIZE (128).
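The following is a minimal, illustrative user-space sketch of that slot arithmetic (it is not the kernel code itself); it hard-codes the v5.15 swiotlb constants (IO_TLB_SHIFT = 11, IO_TLB_SEGSIZE = 128) and the min_align_mask of HV_HYP_PAGE_SIZE - 1 set by storvsc, and reproduces the 129-slot result for the example address above:

#include <stdint.h>
#include <stdio.h>

#define IO_TLB_SHIFT      11                   /* each swiotlb slot is 2 KiB */
#define IO_TLB_SIZE       (1UL << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE    128                  /* max contiguous slots per allocation */
#define HV_HYP_PAGE_SIZE  4096                 /* storvsc sets min_align_mask = 4K - 1 */

/* Same rounding as the kernel's nr_slots() helper. */
static unsigned long nr_slots(uint64_t val)
{
        return (val + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

int main(void)
{
        uint64_t orig_addr      = 0x10903f200ULL; /* example address from the bug */
        uint64_t alloc_size     = 256 * 1024;     /* what swiotlb_max_mapping_size() allows */
        uint64_t min_align_mask = HV_HYP_PAGE_SIZE - 1;

        /* Offset within the page that the bounce buffer must preserve. */
        uint64_t offset = orig_addr & min_align_mask & (IO_TLB_SIZE - 1);

        unsigned long slots = nr_slots(alloc_size + offset);

        printf("offset = 0x%llx, slots needed = %lu (limit %d)\n",
               (unsigned long long)offset, slots, IO_TLB_SEGSIZE);
        /* Prints: offset = 0x200, slots needed = 129 (limit 128) */
        return 0;
}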
The issue affects the hv_storvsc driver, as it calls:

dma_set_min_align_mask(&device->device, HV_HYP_PAGE_SIZE - 1);

dma_set_min_align_mask() is also called by hv_netvsc, but netvsc is not affected, because netvsc never calls swiotlb_tbl_map_single() with a size close to 256KB. dma_set_min_align_mask() is also called by the NVMe driver, but since PCI device assignment is not supported for CVM at the moment, NVMe is not affected for now.

Tianyu Lan made a fix which is under review:
https://lwn.net/ml/linux-kernel/20220510142109.777738-1-ltykernel%40gmail.com/

Note: the linux-azure-cvm v5.4 kernel doesn't need the fix, as that kernel uses a vmbus-private bounce buffering implementation (drivers/hv/hv_bounce.c) rather than swiotlb.

[Test Case]

Microsoft tested.

[Where things could go wrong]

Bounce buffers may fail to allocate.

[Other Info]

SF: #00336634

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1973169/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp