The pre-patch discussion thread has a test program that I used to reproduce the issue. The test program never completes if the bug is present.
I have not been through this process yet. Is it appropriate for me to do the testing? If so, is there a document or set of steps that describes the appropriate way to test?

The pre-patch discussion thread:
https://lore.kernel.org/io-uring/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/T/#u

The following is a copy of the test program:

Test program usage:

./io_uring_open_close_audit_hang --directory /tmp/deleteme --count 10000

Test program source:

// Note: The test program is C++ but could be converted to C.

#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <filesystem>
#include <getopt.h>
#include <iostream>
#include <liburing.h>
#include <string>

// open and close a file. the file is created if it does not exist.
void openClose(struct io_uring& ring, std::string fileName)
{
    int ret;
    struct io_uring_cqe* cqe {};
    struct io_uring_sqe* sqe {};
    int fd {};
    int flags {O_RDWR | O_CREAT};
    mode_t mode {0666};

    // openat
    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);
    io_uring_prep_openat(sqe, AT_FDCWD, fileName.data(), flags, mode);
    // IOSQE_ASYNC queues the op to an io_uring worker thread.
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);
    ret = io_uring_submit(&ring);
    assert(ret == 1);
    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);
    // a negative res is -errno. 0 is a valid file descriptor.
    fd = cqe->res;
    assert(fd >= 0);
    io_uring_cqe_seen(&ring, cqe);

    // close
    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);
    io_uring_prep_close(sqe, fd);
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);
    ret = io_uring_submit(&ring);
    assert(ret == 1);
    // wait for the close to complete.
    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);
    // verify that close succeeded.
    assert(cqe->res == 0);
    io_uring_cqe_seen(&ring, cqe);
}

// create 100 files and then open each file twice.
void openCloseHang(std::string filePath)
{
    int ret;
    struct io_uring ring;

    ret = io_uring_queue_init(8, &ring, 0);
    assert(0 == ret);

    int repeat {3};
    int numFiles {100};

    std::filesystem::create_directory(filePath);

    // files of length 0 are created in the j==0 iteration below.
    // those files are opened and closed during the j>0 iterations.
    // a repeat of 3 results in a fairly reliable reproduction.
    for (int j = 0; j < repeat; j += 1) {
        for (int i = 0; i < numFiles; i += 1) {
            std::string fileName(filePath + "/file" + std::to_string(i));
            openClose(ring, fileName);
        }
    }

    std::filesystem::remove_all(filePath);
    io_uring_queue_exit(&ring);
}

int main(int argc, char** argv)
{
    std::string filePath {};
    int iterations {};

    struct option options[] {
        {"help", no_argument, 0, 'h'},
        {"directory", required_argument, 0, 'd'},
        {"count", required_argument, 0, 'c'},
        { 0, 0, 0, 0 }
    };

    bool printUsage {false};
    int val {};

    while ((val = getopt_long_only(argc, argv, "", options, nullptr)) != -1) {
        if (val == 'h') {
            printUsage = true;
        } else if (val == 'd') {
            filePath = optarg;
            if (std::filesystem::exists(filePath)) {
                printUsage = true;
                std::cerr << "directory must not exist" << std::endl;
            }
        } else if (val == 'c') {
            iterations = atoi(optarg);
            if (0 == iterations) {
                printUsage = true;
            }
        } else {
            printUsage = true;
        }
    }

    if ((0 == iterations) || (filePath.empty())) {
        printUsage = true;
    }

    if (printUsage || (optind < argc)) {
        std::cerr << "io_uring_open_close_audit_hang.cc --directory DIR --count COUNT" << std::endl;
        exit(1);
    }

    for (int i = 0; i < iterations; i += 1) {
        if (0 == (i % 100)) {
            std::cout << "i=" << std::to_string(i) << std::endl;
        }
        openCloseHang(filePath);
    }
    return 0;
}

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2043841

Title:
  kernel BUG: io_uring openat triggers audit reference count underflow

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Lunar:
  Fix Committed
Status in linux source package in Mantic:
  Fix Committed

Bug description:
  I first encountered a bug in 6.2.0-1012-azure #12~22.04.1-Ubuntu that occurs during io_uring openat audit processing. I have a kernel patch that was accepted into the upstream kernel as well as the v6.6, v6.5.9, and v6.1.60 releases.
  The bug was first introduced in the upstream v5.16 kernel.

  I do not see the change yet in:

  * The Ubuntu-azure-6.2-6.2.0-1017.17_22.04.1 tag in the jammy kernel repository.
  * The Ubuntu-azure-6.5.0-1009.9 tag in the mantic kernel repository.

  Can this upstream commit be cherry picked?

  The upstream commit is:

  03adc61edad49e1bbecfb53f7ea5d78f398fe368

  The upstream patch thread is:

  https://lore.kernel.org/audit/20231012215518.ga4...@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/T/#u

  The maintainer pull request thread is:

  https://lore.kernel.org/lkml/20231019-kampfsport-metapher-e5211d7be247@brauner

  The pre-patch discussion thread is:

  https://lore.kernel.org/io-uring/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/T/#u

  The commit log message is:

  commit 03adc61edad49e1bbecfb53f7ea5d78f398fe368
  Author: Dan Clash <dacl...@linux.microsoft.com>
  Date:   Thu Oct 12 14:55:18 2023 -0700

      audit,io_uring: io_uring openat triggers audit reference count underflow

      An io_uring openat operation can update an audit reference count
      from multiple threads resulting in the call trace below.

      A call to io_uring_submit() with a single openat op with a flag of
      IOSQE_ASYNC results in the following reference count updates.

      The first part of the system call performs two increments that do not race.

          do_syscall_64()
            __do_sys_io_uring_enter()
              io_submit_sqes()
                io_openat_prep()
                  __io_openat_prep()
                    getname()
                      getname_flags()       /* update 1 (increment) */
                        __audit_getname()   /* update 2 (increment) */

      The openat op is queued to an io_uring worker thread which starts the
      opportunity for a race.

      The system call exit performs one decrement.

          do_syscall_64()
            syscall_exit_to_user_mode()
              syscall_exit_to_user_mode_prepare()
                __audit_syscall_exit()
                  audit_reset_context()
                    putname()               /* update 3 (decrement) */

      The io_uring worker thread performs one increment and two decrements.
      These updates can race with the system call decrement.

          io_wqe_worker()
            io_worker_handle_work()
              io_wq_submit_work()
                io_issue_sqe()
                  io_openat()
                    io_openat2()
                      do_filp_open()
                        path_openat()
                          __audit_inode()   /* update 4 (increment) */
                      putname()             /* update 5 (decrement) */
                  __audit_uring_exit()
                    audit_reset_context()
                      putname()             /* update 6 (decrement) */

      The fix is to change the refcnt member of struct audit_names
      from int to atomic_t.

      kernel BUG at fs/namei.c:262!
      Call Trace:
      ...
      ? putname+0x68/0x70
      audit_reset_context.part.0.constprop.0+0xe1/0x300
      __audit_uring_exit+0xda/0x1c0
      io_issue_sqe+0x1f3/0x450
      ? lock_timer_base+0x3b/0xd0
      io_wq_submit_work+0x8d/0x2b0
      ? __try_to_del_timer_sync+0x67/0xa0
      io_worker_handle_work+0x17c/0x2b0
      io_wqe_worker+0x10a/0x350

      Cc: sta...@vger.kernel.org
      Link: https://lore.kernel.org/lkml/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/
      Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
      Signed-off-by: Dan Clash <dacl...@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20231012215518.ga4...@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
      Reviewed-by: Jens Axboe <ax...@kernel.dk>
      Signed-off-by: Christian Brauner <brau...@kernel.org>

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2043841/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
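
Note on the fix described in the commit log above: the following is a minimal user-space sketch, not kernel code, of why an atomic reference count keeps the six updates balanced. The names Name, get, put, and runRounds are hypothetical and exist only for this illustration, and std::atomic<int> stands in for the kernel's atomic_t as an assumption. Each round replays updates 1 and 2 on one thread, then lets update 3 (system call exit) race with updates 4, 5, and 6 (worker thread).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a reference-counted name object.
struct Name {
    std::atomic<int> refcnt {0};
    std::atomic<int> frees {0};   // how many times the last reference was dropped
};

// Take a reference (an "increment" update in the commit log).
void get(Name& n)
{
    n.refcnt.fetch_add(1);
}

// Drop a reference (a "decrement" update in the commit log).
void put(Name& n)
{
    // fetch_sub returns the value held before the decrement.
    int prev = n.refcnt.fetch_sub(1);
    assert(prev > 0);             // an underflow would trip here
    if (prev == 1) {
        n.frees.fetch_add(1);     // this caller dropped the last reference
    }
}

// Replays the six updates for the given number of rounds. Returns the number
// of rounds that ended with a count of zero and exactly one release.
int runRounds(int rounds)
{
    int good {0};
    for (int i = 0; i < rounds; i += 1) {
        Name n;
        get(n);                                   // update 1
        get(n);                                   // update 2
        std::thread syscallExit([&n] { put(n); }); // update 3
        std::thread worker([&n] {
            get(n);                               // update 4
            put(n);                               // update 5
            put(n);                               // update 6
        });
        syscallExit.join();
        worker.join();
        if ((n.refcnt.load() == 0) && (n.frees.load() == 1)) {
            good += 1;
        }
    }
    return good;
}
```

Calling runRounds(1000) from a main function returns 1000 in this sketch, because every atomic read-modify-write is applied exactly once regardless of interleaving. With a plain int in place of std::atomic<int>, the racing updates could be lost, which is the kind of imbalance the commit log attributes the underflow to.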