** Changed in: linux (Ubuntu Noble)
       Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu Oracular)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2091719

Title:
  btrfs will WARN_ON() in btrfs_remove_qgroup() unnecessarily

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Noble:
  Fix Committed
Status in linux source package in Oracular:
  Fix Committed

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/2091719

  [Impact]

  The following commit, backported to noble and oracular, introduced two new
  WARN_ON() calls in btrfs qgroup removal. The author believed at the time that
  they would not be reachable, but they turn out to trigger quite frequently
  under the right conditions.

  ubuntu-noble b2ad25ba539452f492805e5f7d94e80894aa860f
  commit a776bf5f3c2300cfdf8a195663460b1793ac9847
  Author: Qu Wenruo <w...@suse.com>
  Date: Fri Apr 19 14:29:32 2024 +0930
  Subject: btrfs: slightly loosen the requirement for qgroup removal
  Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a776bf5f3c2300cfdf8a195663460b1793ac9847

  $ git describe --contains b2ad25ba539452f492805e5f7d94e80894aa860f
  Ubuntu-6.8.0-50.51~143

  This primarily affects the systemd CI, which runs integration tests on merge.
  The CI boots with panic_on_warn set, so the warning escalates to a kernel
  panic:

  https://github.com/systemd/systemd/actions/runs/12297539029/job/34318915884?pr=35589

  Kernel panic - not syncing: kernel: panic_on_warn set ...
  CPU: 0 PID: 1316 Comm: (sd-clean) Not tainted 6.8.0-50-generic #51-Ubuntu
  Call Trace:
   <TASK>
   dump_stack_lvl+0x27/0xa0
   dump_stack+0x10/0x20
   panic+0x366/0x3c0
   ? btrfs_remove_qgroup+0x271/0x490 [btrfs]
   check_panic_on_warn+0x4f/0x60
   __warn+0x95/0x160
   ? btrfs_remove_qgroup+0x271/0x490 [btrfs]
   report_bug+0x17e/0x1b0
   handle_bug+0x51/0xa0
   exc_invalid_op+0x18/0x80
   asm_exc_invalid_op+0x1b/0x20
  RIP: 0010:btrfs_remove_qgroup+0x271/0x490 [btrfs]
  Code: c0 0f 85 27 fe ff ff 48 8b 43 b0 4c 39 f0 75 d5 4d 8d b5 e0 08 00 00 4c 89 f7 e8 8a 45 19 e2 48 83 7b 98 00 0f 84 52 01 00 00 <0f> 0b 49 8b 45 10 a8 10 74 42 41 f6 85 d0 08 00 00 0c 75 38 48 83
   ? btrfs_remove_qgroup+0x266/0x490 [btrfs]
   btrfs_ioctl+0x12b9/0x13a0 [btrfs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __seccomp_filter+0x368/0x570
   ? __fput+0x15e/0x2e0
   __x64_sys_ioctl+0xa3/0xf0
   x64_sys_call+0x12a3/0x25a0
   do_syscall_64+0x7f/0x180
   entry_SYSCALL_64_after_hwframe+0x78/0x80

  [Fix]

  The fix just landed in mainline as:

  commit c0def46dec9c547679a25fe7552c4bcbec0b0dd2
  Author: Qu Wenruo <w...@suse.com>
  Date:   Mon Nov 11 07:29:07 2024 +1030
  Subject: btrfs: improve the warning and error message for btrfs_remove_qgroup()
  Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c0def46dec9c547679a25fe7552c4bcbec0b0dd2

  The commit places the WARN_ON() behind CONFIG_BTRFS_DEBUG, which silences the
  warning for most users. This is safe because, as the author notes, the user
  space tool managing the qgroups will rescan them to fix the inconsistent
  view.
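
  The gating pattern described above can be illustrated with a minimal
  userspace sketch. This is not the upstream patch: SKETCH_BTRFS_DEBUG,
  sketch_warn_on(), struct sketch_qgroup and sketch_remove_qgroup() are
  hypothetical stand-ins invented for illustration, and the sketch assumes
  that removal succeeds whether or not the warning condition holds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's CONFIG_BTRFS_DEBUG option. */
#ifndef SKETCH_BTRFS_DEBUG
#define SKETCH_BTRFS_DEBUG 0
#endif

/* Counts emitted warnings so the behaviour can be observed. */
static int sketch_warn_count;

/* Simplified stand-in for WARN_ON(): reports when cond is true, returns cond. */
static bool sketch_warn_on(bool cond, const char *what)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Illustrative subset of qgroup accounting numbers (assumed for the sketch). */
struct sketch_qgroup {
	uint64_t rfer; /* referenced bytes */
	uint64_t excl; /* exclusive bytes */
};

/*
 * Removal proceeds either way (an assumption of this sketch); the warning
 * about leftover accounting numbers fires only in debug builds.
 */
static int sketch_remove_qgroup(const struct sketch_qgroup *qg)
{
	bool stale = qg->rfer != 0 || qg->excl != 0;

	if (SKETCH_BTRFS_DEBUG)
		sketch_warn_on(stale, "qgroup removed with non-zero numbers");
	return 0;
}
```

  Built with the default SKETCH_BTRFS_DEBUG=0, calling sketch_remove_qgroup()
  on a qgroup with leftover numbers prints nothing. Building with
  -DSKETCH_BTRFS_DEBUG=1 makes the same call print the warning, which is the
  case a panic_on_warn system would turn into a panic.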

  This is needed for both noble and oracular.

  [Testcase]

  The upstream systemd CI tests can consistently reproduce the issue, so the test
  and proposed kernels will be run against the systemd CI for verification.

  There is a test kernel available in the following ppa:

  https://launchpad.net/~mruffell/+archive/ubuntu/lp2091719-test

  With the test kernel installed, the systemd CI runs to completion.

  [Where problems could occur]

  We are changing the WARN_ON() to occur only when CONFIG_BTRFS_DEBUG is enabled.
  There is no other change in logic, so functionality should be the same as what
  we have now.

  If a regression were to occur, it would affect systems with btrfs filesystems
  that use subvolumes. It would be unlikely to cause data loss or disk
  corruption, as userspace tools should be able to automatically fix up any
  inconsistent views without user interaction.

  [Other info]

  Systemd upstream bisected the issue here:
  https://github.com/systemd/systemd/pull/35567#issuecomment-2538160543

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2091719/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
