I tested the -proposed package for mantic on a clean install by creating a
test pool and running the zhammer.sh script linked in comment #16.

…
As expected, on an unpatched system the error occurred during the first
iteration:

[zhammer::1858] zhammer_1858_0 differed from zhammer_1858_538!
[zhammer::1858] Hexdump diff follows
--- zhammer_1858_0.hex  2024-02-03 12:44:07.478205144 +0000
+++ zhammer_1858_538.hex        2024-02-03 12:44:07.478205144 +0000
@@ -1,3 +1,3 @@
-00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
+00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00004000
[zhammer::1858] Uname: Linux zfstest 6.5.0-15-generic #15-Ubuntu SMP 
PREEMPT_DYNAMIC Tue Jan  9 17:03:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
[zhammer::1858] ZFS userspace: zfs-2.2.0-0ubuntu1~23.10.1
[zhammer::1858] ZFS kernel: zfs-kmod-2.2.0-0ubuntu1~23.10
[zhammer::1858] Module: /lib/modules/6.5.0-15-generic/kernel/zfs/zfs.ko.zst
[zhammer::1858] Srcversion: 92158472E32FE6AEEEC7201
[zhammer::1858] SHA256: 
177442f43f4c94537f8b003ab28ed33d00240c175e500370ad5bdd5c50234655
parallel: This job failed: zhammer /test 10000000 16k 10000 7
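
The failure above is a copy that reads back as zeros where the original holds
0xff bytes, over the whole 0x4000 (16 KiB) file. As a rough illustration of
that kind of check, here is a minimal in-memory sketch. It is not the zhammer
script itself; the function names are hypothetical and it never touches a pool.

```python
import hashlib


def find_corrupt_copies(original: bytes, copies: dict) -> list:
    """Return the names of copies whose SHA-256 differs from the original's."""
    want = hashlib.sha256(original).digest()
    return sorted(
        name for name, data in copies.items()
        if hashlib.sha256(data).digest() != want
    )


def first_diff_offset(a: bytes, b: bytes):
    """Return the first byte offset at which a and b differ, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One buffer is a strict prefix of the other.
        return min(len(a), len(b))
    return None


# Illustrative data only: a 16 KiB block of 0xff and a zeroed "copy",
# mirroring the shape of the hexdump diff in the log.
original = b"\xff" * 0x4000
copies = {"copy_ok": original, "copy_zeroed": bytes(0x4000)}
bad = find_corrupt_copies(original, copies)
```

Hashing first and locating the offset only for mismatching copies keeps the
common (matching) case cheap.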


…
After enabling the -proposed repository, installing the updates, and restarting
the system, it looks like the userspace tools are now on the patched version
(zfs-2.2.0-0ubuntu1~23.10.2), but the kernel module is still on the old
version (zfs-kmod-2.2.0-0ubuntu1~23.10, without the point-release suffix) and,
as expected, the bug is still reproducible:

[zhammer::1706] zhammer_1706_0 differed from zhammer_1706_1204!
[zhammer::1706] Hexdump diff follows
--- zhammer_1706_0.hex  2024-02-04 14:29:28.296850257 +0000
+++ zhammer_1706_1204.hex       2024-02-04 14:29:28.296850257 +0000
@@ -1,3 +1,3 @@
-00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
+00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00004000
[zhammer::1706] Uname: Linux zfstest 6.5.0-17-generic #17-Ubuntu SMP 
PREEMPT_DYNAMIC Thu Jan 11 14:01:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
[zhammer::1706] ZFS userspace: zfs-2.2.0-0ubuntu1~23.10.2
[zhammer::1706] ZFS kernel: zfs-kmod-2.2.0-0ubuntu1~23.10
[zhammer::1706] Module: /lib/modules/6.5.0-17-generic/kernel/zfs/zfs.ko.zst
[zhammer::1706] Srcversion: 92158472E32FE6AEEEC7201
[zhammer::1706] SHA256: 
0f6a069f6c3045e7c86507d7c158691d4ace8c6785888579652236fbdf8c66c0
parallel: This job failed: zhammer /test 10000000 16k 10000 2
…


Only when I explicitly use the zfs-dkms package instead of the built-in
kernel module is the correct module loaded, and the bug can no longer be
triggered, even after 5 iterations (x 10,000 files).
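
The userspace/kernel-module mismatch described above can be spotted from the
two version lines that `zfs version` prints (the same strings zhammer logs as
"ZFS userspace" and "ZFS kernel"). A minimal sketch, assuming that two-line
output format; the helper names are hypothetical, and a mismatch is only a
hint, since a distribution may version the bundled module separately:

```python
import subprocess


def parse_zfs_versions(text: str):
    """Return (userspace, kmod) version strings from `zfs version`-style text."""
    userspace = kmod = None
    for line in text.splitlines():
        line = line.strip()
        # Check the longer prefix first: "zfs-kmod-" also starts with "zfs-".
        if line.startswith("zfs-kmod-"):
            kmod = line[len("zfs-kmod-"):]
        elif line.startswith("zfs-"):
            userspace = line[len("zfs-"):]
    return userspace, kmod


def module_matches_userspace(text: str) -> bool:
    """True only if both versions were found and are identical."""
    userspace, kmod = parse_zfs_versions(text)
    return userspace is not None and userspace == kmod


def read_zfs_version_output() -> str:
    """Run `zfs version` on the local machine (requires ZFS to be installed)."""
    result = subprocess.run(
        ["zfs", "version"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Version strings taken from the second log above.
sample = "zfs-2.2.0-0ubuntu1~23.10.2\nzfs-kmod-2.2.0-0ubuntu1~23.10\n"
versions = parse_zfs_versions(sample)
matches = module_matches_userspace(sample)
```

On a live system, `module_matches_userspace(read_zfs_version_output())` would
run the same comparison against the currently loaded module.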


I therefore conclude that the fix itself works correctly, but the package
distributed in the -proposed repository does not include the corrected kernel
module. Since most people use ZFS on Ubuntu through the built-in kernel module
rather than the dkms module, the fix also has to be included in the current
kernel package to resolve the problem.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/2044657

Title:
  Multiple data corruption issues in zfs

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Xenial:
  Confirmed
Status in zfs-linux source package in Bionic:
  Confirmed
Status in zfs-linux source package in Focal:
  Fix Committed
Status in zfs-linux source package in Jammy:
  Fix Committed
Status in zfs-linux source package in Lunar:
  Won't Fix
Status in zfs-linux source package in Mantic:
  Fix Committed
Status in zfs-linux source package in Noble:
  Fix Released

Bug description:
  [ Impact ]

   * Multiple data corruption issues have been identified and fixed in
  ZFS. Some of them, at varying real-life reproducibility frequency, have
  been determined to affect very old zfs releases. Recommendation is to
  upgrade to 2.2.2 or 2.1.14 or backport dnat patch alone. This is to
  ensure users get other potentially related fixes and runtime tunables
  to possibly mitigate other bugs that are related and are being fixed
  upstream for future releases.

   * For jammy the 2.1.14 upgrade will bring HWE kernel support and also
  compatibility/support for hardened kernel builds that mitigate SLS
  (straight-line-speculation).

   * In the absence of the upgrade a cherry-pick will address this
  particular popular issue alone - without addressing other issues
  w.r.t. Retbleed / SLS, bugfixes around trim support, and other related
  improvements that were discovered and fixed around the same time as
  this popular issue.

  [ Test Plan ]

   * !!! Danger !!! use reproducer from
  https://zfsonlinux.topicbox.com/groups/zfs-discuss/T12876116b8607cdb
  and confirm if that issue is resolved or not. Do not run on production
  ZFS pools / systems.

   * autopkgtest pass (from
  https://ubuntu-archive-team.ubuntu.com/proposed-migration/ )

   * adt-matrix pass (from https://kernel.ubuntu.com/adt-matrix/ )

   * kernel regression zfs testsuite pass (from Kernel team RT test
  results summary, private)

   * zsys integration test pass (upgrade of zsys installed systems for
  all releases)

   * zsys install test pass (for daily images of LTS releases only that
  have such installer support, as per iso tracker test case)

   * LXD (ping LXD team to upgrade vendored in tooling to 2.2.2 and
  2.1.14, and test LXD on these updated kernels)

  [ Where problems could occur ]

   * Upgrade to 2.1.14 on jammy with SLS mitigations compatibility will
  introduce a slight slowdown on amd64 (for hw accelerated assembly code
  paths only in the encryption primitives)

   * Uncertain of the performance impact of the extra checks in dnat
  patch fix itself. Possibly affecting speed of operation, at the
  benefit of correctness.

   * The cherry-picked patch ("dnat"? dnode) changes the dirty data check, but
     only makes it stronger and not weaker, thus if it were incorrect, likely
     only performance would be impacted (and it is unlikely to be incorrect
     given upstream reviews and attention to data corruption issues; also,
     there are no additional changes to that function upstream)

  [ Other Info ]

   * https://github.com/openzfs/zfs/pull/15571 is most current
  consideration of affairs
