The pre-patch discussion thread has a test program that I used to
reproduce the issue.  The test program never completes if the bug is
present.

I have not been through this process yet.  Is it appropriate for me to
do the testing?  If so, is there a document or a set of steps that
describes the appropriate way to test?

https://lore.kernel.org/io-uring/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/T/#u

The following is a copy of the test program:

Test program usage:

./io_uring_open_close_audit_hang --directory /tmp/deleteme --count 10000

Test program source:

// Note: The test program is C++ but could be converted to C.
#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <filesystem>
#include <getopt.h>
#include <iostream>
#include <liburing.h>
#include <string>

// open and close a file.  the file is created if it does not exist.

void
openClose(struct io_uring& ring, std::string fileName)
{
    int ret;
    struct io_uring_cqe* cqe {};
    struct io_uring_sqe* sqe {};
    int fd {};
    int flags {O_RDWR | O_CREAT};
    mode_t mode {0666};

    // openat

    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);

    io_uring_prep_openat(sqe, AT_FDCWD, fileName.data(), flags, mode);
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);

    ret = io_uring_submit(&ring);
    assert(ret == 1);

    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);

    fd = cqe->res;
    // cqe->res is the new file descriptor on success or -errno on failure.
    assert(fd >= 0);

    io_uring_cqe_seen(&ring, cqe);

    // close

    sqe = io_uring_get_sqe(&ring);
    assert(sqe != nullptr);

    io_uring_prep_close(sqe, fd);
    io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);

    ret = io_uring_submit(&ring);
    assert(ret == 1);

    // wait for the close to complete.
    ret = io_uring_wait_cqe(&ring, &cqe);
    assert(ret == 0);

    // verify that close succeeded.
    assert(cqe->res == 0);

    io_uring_cqe_seen(&ring, cqe);
}

// create 100 files and then open each file twice.

void
openCloseHang(std::string filePath)
{
    int ret;
    struct io_uring ring;

    ret = io_uring_queue_init(8, &ring, 0);
    assert(0 == ret);

    int repeat {3};
    int numFiles {100};

    std::filesystem::create_directory(filePath);

    // files of length 0 are created in the j==0 iteration below.
    // those files are opened and closed during the j>0 iterations.
    // a repeat of 3 results in a fairly reliable reproduction.

    for (int j = 0; j < repeat; j += 1) {
        for (int i = 0; i < numFiles; i += 1) {
            std::string fileName(filePath + "/file" + std::to_string(i));
            openClose(ring, fileName);
        }
    }

    std::filesystem::remove_all(filePath);

    io_uring_queue_exit(&ring);
}

int
main(int argc, char** argv)
{
    std::string filePath {};
    int iterations {};

    struct option options[] {
        {"help", no_argument, 0, 'h'},
        {"directory", required_argument, 0, 'd'},
        {"count", required_argument, 0, 'c'},
        {0, 0, 0, 0}
    };
    bool printUsage {false};
    int val {};

    while ((val = getopt_long_only(argc, argv, "", options, nullptr)) != -1) {
        if (val == 'h') {
            printUsage = true;
        } else if (val == 'd') {
            filePath = optarg;
            if (std::filesystem::exists(filePath)) {
                printUsage = true;
                std::cerr << "directory must not exist" << std::endl;
            }
        } else if (val == 'c') {
            iterations = atoi(optarg);
            if (0 == iterations) {
                printUsage = true;
            }
        } else {
            printUsage = true;
        }
    }

    if ((0 == iterations) || (filePath.empty())) {
        printUsage = true;
    }

    if (printUsage || (optind < argc)) {
        std::cerr << "io_uring_open_close_audit_hang.cc --directory DIR "
                     "--count COUNT" << std::endl;
        exit(1);
    }

    for (int i = 0; i < iterations; i += 1) {
        if (0 == (i % 100)) {
            std::cout << "i=" << std::to_string(i) << std::endl;
        }
        openCloseHang(filePath);
    }
    return 0;
}

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2043841

Title:
  kernel BUG: io_uring openat triggers audit reference count underflow

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Lunar:
  Fix Committed
Status in linux source package in Mantic:
  Fix Committed

Bug description:
  I first encountered a bug in 6.2.0-1012-azure #12~22.04.1-Ubuntu that
  occurs during io_uring openat audit processing.  I have a kernel patch
  that was accepted into the upstream kernel as well as the v6.6,
  v6.5.9, and v6.1.60 releases.  The bug was first introduced in the
  upstream v5.16 kernel.

  I do not see the change yet in:

  * The Ubuntu-azure-6.2-6.2.0-1017.17_22.04.1 tag in the jammy kernel
    repository.
  * The Ubuntu-azure-6.5.0-1009.9 tag in the mantic kernel repository.

  Can this upstream commit be cherry picked?

  The upstream commit is:

  03adc61edad49e1bbecfb53f7ea5d78f398fe368

  The upstream patch thread is:

  https://lore.kernel.org/audit/20231012215518.ga4...@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/T/#u

  The maintainer pull request thread is:

  https://lore.kernel.org/lkml/20231019-kampfsport-metapher-e5211d7be247@brauner

  The pre-patch discussion thread is:

  https://lore.kernel.org/io-uring/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/T/#u

  The commit log message is:

  commit 03adc61edad49e1bbecfb53f7ea5d78f398fe368
  Author: Dan Clash <dacl...@linux.microsoft.com>
  Date:   Thu Oct 12 14:55:18 2023 -0700

      audit,io_uring: io_uring openat triggers audit reference count underflow

      An io_uring openat operation can update an audit reference count
      from multiple threads resulting in the call trace below.

      A call to io_uring_submit() with a single openat op with a flag of
      IOSQE_ASYNC results in the following reference count updates.

      The first part of the system call performs two increments that
      do not race.

      do_syscall_64()
        __do_sys_io_uring_enter()
          io_submit_sqes()
            io_openat_prep()
              __io_openat_prep()
                getname()
                  getname_flags()       /* update 1 (increment) */
                    __audit_getname()   /* update 2 (increment) */

      The openat op is queued to an io_uring worker thread which starts the
      opportunity for a race.  The system call exit performs one decrement.

      do_syscall_64()
        syscall_exit_to_user_mode()
          syscall_exit_to_user_mode_prepare()
            __audit_syscall_exit()
              audit_reset_context()
                 putname()              /* update 3 (decrement) */

      The io_uring worker thread performs one increment and two decrements.
      These updates can race with the system call decrement.

      io_wqe_worker()
        io_worker_handle_work()
          io_wq_submit_work()
            io_issue_sqe()
              io_openat()
                io_openat2()
                  do_filp_open()
                    path_openat()
                      __audit_inode()   /* update 4 (increment) */
                  putname()             /* update 5 (decrement) */
              __audit_uring_exit()
                audit_reset_context()
                  putname()             /* update 6 (decrement) */

      The fix is to change the refcnt member of struct audit_names
      from int to atomic_t.

      kernel BUG at fs/namei.c:262!
      Call Trace:
      ...
       ? putname+0x68/0x70
       audit_reset_context.part.0.constprop.0+0xe1/0x300
       __audit_uring_exit+0xda/0x1c0
       io_issue_sqe+0x1f3/0x450
       ? lock_timer_base+0x3b/0xd0
       io_wq_submit_work+0x8d/0x2b0
       ? __try_to_del_timer_sync+0x67/0xa0
       io_worker_handle_work+0x17c/0x2b0
       io_wqe_worker+0x10a/0x350

      Cc: sta...@vger.kernel.org
      Link: https://lore.kernel.org/lkml/mw2pr2101mb1033fff044a258f84aeaa584f1...@mw2pr2101mb1033.namprd21.prod.outlook.com/
      Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
      Signed-off-by: Dan Clash <dacl...@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20231012215518.ga4...@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
      Reviewed-by: Jens Axboe <ax...@kernel.dk>
      Signed-off-by: Christian Brauner <brau...@kernel.org>

