** Description changed:

- I have a server that has been running its data volume using ZFS in 20.04
- without any problem. The volume is using ZFS encryption and a raidz1-0
- configuration. I performed a scrub operation before the upgrade and it
- did not find any problem. After the reboot for the upgrade, I was
- welcomed with the following message:
+ [ Impact ]
+ Upgrading from 20.04 to 22.04 causes encrypted pools to become unmountable.
+ This is due to broken accounting metadata causing checksum errors on decrypt,
+ which makes ZFS error out early with ECKSUM.
+ 
+ [ Test Plan ]
+ This issue needs specific accounting metadata on the zpool to be broken, and
+ as such is somewhat tricky to reproduce organically. A regular test plan for an
+ affected pool should be:
+ 1. Set up an encrypted zpool under 20.04
+ 2. Upgrade the system to 22.04 (e.g. using the do-release-upgrade script)
+ 3. Verify that the zpool fails to mount under 22.04 (zpool status will likely
+    point to ZFS-8000-8A "Corrupted data" [0])
+ 
+ [0] https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A/
+ 
+ Thankfully, upstream has included a test scenario for this under the ZFS test
+ suite, which is run during build. The
+ tests/zfs-tests/tests/functional/userquota/13709_reproducer.bz2 file is taken
+ directly from upstream, and corresponds to an encrypted zpool with the
+ required (broken) metadata to reproduce this issue. If the ZFS test suite
+ passes, this should give us a strong signal that this issue is fixed.
+ 
+ [ Where problems could occur ]
+ Although I've backported the upstream test, it'd be great to have confirmation
+ from affected users that this patch resolves the issue. Additionally, we
+ should also perform upgrades on non-affected zpools as well as non-encrypted
+ zpools, to ensure no regressions have been introduced.
+ 
+ Considering this change affects the encrypt/decrypt code paths, problems could
+ arise in creating new encrypted zpools, as well as when mounting zpools that
+ have been previously encrypted.
+ 
+ [ Other Info ]
+ This SRU includes slightly more than the minimal changes mentioned in
+ the SRU policy, as I've also backported one of upstream's tests for encrypted
+ pools. This included a new test script (userspace_encrypted_13709.ksh), as
+ well as a binary zpool dump (13709_reproducer.bz2) that I've added under
+ d/s/include-binaries.
+ 
+ Considering this issue causes zpools to become unmountable, I think it's worth
+ including these in the standard ZFS test suite (similar to an autopkgtest
+ scenario for a high-risk regression). These are included in future releases of
+ zfs-linux, and as such only Jammy is affected by this regression.
+ --
+ 
+ [ Original Description ]
+ I have a server that has been running its data volume using ZFS in 20.04
+ without any problem. The volume is using ZFS encryption and a raidz1-0
+ configuration. I performed a scrub operation before the upgrade and it did not
+ find any problem. After the reboot for the upgrade, I was welcomed with the
+ following message:
  
  status: One or more devices has experienced an error resulting in data
-         corruption.  Applications may be affected.
+         corruption.  Applications may be affected.
  action: Restore the file in question if possible.  Otherwise restore the
-         entire pool from backup.
-    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
+         entire pool from backup.
+    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  
  The volumes still do not have any checksum error but there are 5 zvols
  that are not accessible. zpool status displays a line similar to the
  below for each of the five:
  
- errors: Permanent errors have been detected in the following files:
- 
-         tank/data/data:<0x0>
+ errors: Permanent errors have been detected in the following files:
+ 
+         tank/data/data:<0x0>
  
  I ran a scrub and it did not identify any problem, but the error
  messages are not there and the data is still not available. There are
  10+ other zvols in the zpool that do not have any kind of problem. I
  have been unable to identify any correlation between the zvols that are
  failing.
  
  I have seen people reporting similar problems on GitHub after the 20.04
  to 22.04 upgrade (see https://github.com/openzfs/zfs/issues/13763).
  I wonder how widespread the problem will be as more people upgrade to
  22.04.
  
  I will try to downgrade the version of zfs in the system and report back.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: zfsutils-linux 2.1.4-0ubuntu0.1
  ProcVersionSignature: Ubuntu 5.15.0-46.49-generic 5.15.39
  Uname: Linux 5.15.0-46-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu82.1
  Architecture: amd64
  CasperMD5CheckResult: unknown
  Date: Sat Aug 20 22:24:54 2022
  ProcEnviron:
-  TERM=screen-256color
-  PATH=(custom, no user)
-  XDG_RUNTIME_DIR=<set>
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  TERM=screen-256color
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=<set>
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  SourcePackage: zfs-linux
  UpgradeStatus: Upgraded to jammy on 2022-08-20 (0 days ago)
  modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']

** Also affects: zfs-linux (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: zfs-linux (Ubuntu Jammy)
     Assignee: (unassigned) => Heitor Alves de Siqueira (halves)

** Changed in: zfs-linux (Ubuntu Jammy)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu Jammy)
       Status: New => Incomplete

** Changed in: zfs-linux (Ubuntu Jammy)
       Status: Incomplete => In Progress

** Changed in: zfs-linux (Ubuntu)
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1987190

Title:
  ZFS unrecoverable error after upgrading from 20.04 to 22.04.1

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Jammy:
  In Progress

Bug description:
  [ Impact ]
  Upgrading from 20.04 to 22.04 causes encrypted pools to become unmountable.
  This is due to broken accounting metadata causing checksum errors on decrypt,
  which makes ZFS error out early with ECKSUM.

  [ Test Plan ]
  This issue needs specific accounting metadata on the zpool to be broken, and
  as such is somewhat tricky to reproduce organically. A regular test plan for an
  affected pool should be:
  1. Set up an encrypted zpool under 20.04
  2. Upgrade the system to 22.04 (e.g. using the do-release-upgrade script)
  3. Verify that the zpool fails to mount under 22.04 (zpool status will likely
     point to ZFS-8000-8A "Corrupted data" [0])

  [0] https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A/
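
  Step 3 of the plan above can be checked mechanically. The following is
  only a minimal sketch, assuming the raw text of `zpool status -v` has
  already been captured into a string. The function names are hypothetical
  and not part of ZFS or of this SRU; the message ID and the "errors:"
  header are the ones quoted in this report.

```python
# Hypothetical helpers for inspecting captured `zpool status -v` output.
# They only look for the strings quoted in this bug report.

CORRUPTED_DATA_MSGID = "ZFS-8000-8A"
ERRORS_HEADER = "errors: Permanent errors"


def points_to_corrupted_data(status_text):
    """Return True if the status output references ZFS-8000-8A."""
    return CORRUPTED_DATA_MSGID in status_text


def permanent_error_entries(status_text):
    """Return the entries listed under the 'Permanent errors' header."""
    entries = []
    in_errors = False
    for line in status_text.splitlines():
        if line.startswith(ERRORS_HEADER):
            in_errors = True
            continue
        if not in_errors or not line.strip():
            continue
        if not line[0].isspace():
            # Assumption: entries are indented, so an unindented line
            # (for example the next pool's header) ends the list.
            break
        entries.append(line.strip())
    return entries
```

  On an affected pool, the first function would be expected to return True
  and the second to list entries such as tank/data/data:<0x0>. On a fixed
  system both checks should come back empty for the same pool.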

  Thankfully, upstream has included a test scenario for this under the ZFS test
  suite, which is run during build. The
  tests/zfs-tests/tests/functional/userquota/13709_reproducer.bz2 file is taken
  directly from upstream, and corresponds to an encrypted zpool with the
  required (broken) metadata to reproduce this issue. If the ZFS test suite
  passes, this should give us a strong signal that this issue is fixed.

  [ Where problems could occur ]
  Although I've backported the upstream test, it'd be great to have confirmation
  from affected users that this patch resolves the issue. Additionally, we
  should also perform upgrades on non-affected zpools as well as non-encrypted
  zpools, to ensure no regressions have been introduced.

  Considering this change affects the encrypt/decrypt code paths, problems could
  arise in creating new encrypted zpools, as well as when mounting zpools that
  have been previously encrypted.

  [ Other Info ]
  This SRU includes slightly more than the minimal changes mentioned in
  the SRU policy, as I've also backported one of upstream's tests for encrypted
  pools. This included a new test script (userspace_encrypted_13709.ksh), as
  well as a binary zpool dump (13709_reproducer.bz2) that I've added under
  d/s/include-binaries.

  Considering this issue causes zpools to become unmountable, I think it's worth
  including these in the standard ZFS test suite (similar to an autopkgtest
  scenario for a high-risk regression). These are included in future releases of
  zfs-linux, and as such only Jammy is affected by this regression.
  --

  [ Original Description ]
  I have a server that has been running its data volume using ZFS in 20.04
  without any problem. The volume is using ZFS encryption and a raidz1-0
  configuration. I performed a scrub operation before the upgrade and it did not
  find any problem. After the reboot for the upgrade, I was welcomed with the
  following message:

  status: One or more devices has experienced an error resulting in data
          corruption.  Applications may be affected.
  action: Restore the file in question if possible.  Otherwise restore the
          entire pool from backup.
     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A

  The volumes still do not have any checksum error but there are 5 zvols
  that are not accessible. zpool status displays a line similar to the
  below for each of the five:

  errors: Permanent errors have been detected in the following files:

          tank/data/data:<0x0>

  I ran a scrub and it did not identify any problem, but the error
  messages are not there and the data is still not available. There are
  10+ other zvols in the zpool that do not have any kind of problem. I
  have been unable to identify any correlation between the zvols that
  are failing.

  I have seen people reporting similar problems on GitHub after the
  20.04 to 22.04 upgrade (see
  https://github.com/openzfs/zfs/issues/13763). I wonder how widespread
  the problem will be as more people upgrade to 22.04.

  I will try to downgrade the version of zfs in the system and report
  back.

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: zfsutils-linux 2.1.4-0ubuntu0.1
  ProcVersionSignature: Ubuntu 5.15.0-46.49-generic 5.15.39
  Uname: Linux 5.15.0-46-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu82.1
  Architecture: amd64
  CasperMD5CheckResult: unknown
  Date: Sat Aug 20 22:24:54 2022
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: zfs-linux
  UpgradeStatus: Upgraded to jammy on 2022-08-20 (0 days ago)
  modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1987190/+subscriptions

