Package: release.debian.org
Severity: normal
Tags: bookworm
User: release.debian....@packages.debian.org
Usertags: pu
X-Debbugs-Cc: zfs-li...@packages.debian.org, a...@debian.org
Control: affects -1 + src:zfs-linux
Control: block 1063497 by -1
Control: block 1069125 by -1

[ Reason ]

zfs in bookworm (2.1.11) suffers from CVE-2023-49298 (data corruption)
and CVE-2013-20001 (an NFS sharing security issue).

[ Impact ]

Bookworm systems are still affected by these CVEs. The risks are not
grave, but CVE-2023-49298 can lead to silent data corruption.
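
As a rough illustration of the CVE-2023-49298 impact, the sketch below
exercises the pattern that triggers it: a sparse-aware copy right after
a write. The dataset path and iteration count are made up, and the race
is timing-dependent, so treat this as a sketch of the symptom rather
than a reliable reproducer.

  #!/bin/sh
  # With coreutils >= 9.2, cp detects holes via SEEK_DATA/SEEK_HOLE.
  # On an affected kernel, a freshly written (still dirty) file can be
  # misreported as one large hole, so the copy silently becomes zeros.
  cd /tank/test || exit 1
  n=0
  while [ "$n" -lt 1000 ]; do
      n=$((n + 1))
      dd if=/dev/urandom of="src.$n" bs=128k count=1 2>/dev/null
      cp "src.$n" "dst.$n"    # sparse copy right after the write
      cmp -s "src.$n" "dst.$n" || echo "possible corruption: dst.$n"
  done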

[ Tests ]

These patches have been in unstable / bookworm-backports for a long
time. No problems have been reported against them.

[ Risks ]

Risks are minimal: only upstream fixes are cherry-picked, with no
functional changes.

[ Checklist ]
  [x] *all* changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in (old)stable
  [x] the issue is verified as fixed in unstable

[ Changes ]

  * dch: typo fix
  * New symbols for libzfs4linux and libzpool5linux
    (missing in last upload)
  * d/patches: cherry-pick upstream fixes for stability issues
    + fix dnode dirty test (Closes: #1056752, #1063497, CVE-2023-49298)
    + fix sharenfs IPv6 address parsing (Closes: CVE-2013-20001); see
      the usage example after this list
    + assorted fixes for NULL pointer dereferences, memory allocation,
      etc.
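
For the sharenfs fix, here is a minimal example of the bracketed IPv6
notation the cherry-picked patch accepts, following the updated zfs(8)
example (pool and dataset names are illustrative):

  # Share tank/home read-write to an IPv4 subnet and an IPv6 literal.
  # Colons inside [::1] are no longer misparsed as host separators.
  zfs set sharenfs='rw=@123.123.0.0/16:[::1]' tank/home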

[ Other info ]

This request is similar to #1042730, but drops many non-essential
patches. The remaining patches contain ~100 LOC of pure code changes;
most of the rest of the diff is commit messages.

-- 
Thanks,
Shengqi Chen
diff -Nru zfs-linux-2.1.11/debian/changelog zfs-linux-2.1.11/debian/changelog
--- zfs-linux-2.1.11/debian/changelog   2023-04-23 17:29:38.000000000 +0800
+++ zfs-linux-2.1.11/debian/changelog   2024-11-02 15:34:23.000000000 +0800
@@ -1,3 +1,14 @@
+zfs-linux (2.1.11-1+deb12u1) UNRELEASED; urgency=medium
+
+  * dch: typo fix
+  * New symbols for libzfs4linux and libzpool5linux
+  * d/patches: cherry-pick upstream fixes for stability issues
+    + fix dnode dirty test (Closes: #1056752, #1063497, CVE-2023-49298)
+    + fix sharenfs IPv6 address parsing (Closes: CVE-2013-20001)
+    + and some fixes related to NULL pointer, memory allocation, etc.
+
+ -- Shengqi Chen <harry-c...@outlook.com>  Sat, 02 Nov 2024 15:34:23 +0800
+
 zfs-linux (2.1.11-1) unstable; urgency=medium
 
   [ Mo Zhou ]
@@ -5,7 +16,7 @@
 
   [ Aron Xu ]
   * New upstream stable point release version 2.1.11
-  * Drop patches that are alreay in upstream stable release
+  * Drop patches that are already in upstream stable release
 
  -- Aron Xu <a...@debian.org>  Sun, 23 Apr 2023 17:29:38 +0800
 
diff -Nru zfs-linux-2.1.11/debian/libzfs4linux.symbols zfs-linux-2.1.11/debian/libzfs4linux.symbols
--- zfs-linux-2.1.11/debian/libzfs4linux.symbols        2023-04-17 12:44:44.000000000 +0800
+++ zfs-linux-2.1.11/debian/libzfs4linux.symbols        2024-11-02 15:27:19.000000000 +0800
@@ -102,6 +102,7 @@
  snapshot_namecheck@Base 2.0
  spa_feature_table@Base 0.8.2
  unshare_one@Base 2.0
+ use_color@Base 2.1.11
  zcmd_alloc_dst_nvlist@Base 0.8.2
  zcmd_expand_dst_nvlist@Base 0.8.2
  zcmd_free_nvlists@Base 0.8.2
@@ -386,6 +387,7 @@
  zpool_vdev_path_to_guid@Base 0.8.2
  zpool_vdev_remove@Base 0.8.2
  zpool_vdev_remove_cancel@Base 0.8.2
+ zpool_vdev_remove_wanted@Base 2.1.11
  zpool_vdev_split@Base 0.8.2
  zpool_wait@Base 2.0
  zpool_wait_status@Base 2.0
@@ -678,6 +680,8 @@
  zfs_niceraw@Base 2.0
  zfs_nicetime@Base 2.0
  zfs_resolve_shortname@Base 2.0
+ zfs_setproctitle@Base 2.1.11
+ zfs_setproctitle_init@Base 2.1.11
  zfs_strcmp_pathname@Base 2.0
  zfs_strip_partition@Base 2.0
  zfs_strip_path@Base 2.0
diff -Nru zfs-linux-2.1.11/debian/libzpool5linux.symbols zfs-linux-2.1.11/debian/libzpool5linux.symbols
--- zfs-linux-2.1.11/debian/libzpool5linux.symbols      2023-04-17 15:26:55.000000000 +0800
+++ zfs-linux-2.1.11/debian/libzpool5linux.symbols      2024-11-02 15:27:19.000000000 +0800
@@ -685,6 +685,7 @@
  dnode_special_close@Base 0.8.2
  dnode_special_open@Base 0.8.2
  dnode_stats@Base 0.8.2
+ dnode_sums@Base 2.1.11
  dnode_sync@Base 0.8.2
  dnode_try_claim@Base 0.8.2
  dnode_verify@Base 2.0
@@ -2095,6 +2096,7 @@
  vdev_checkpoint_sm_object@Base 0.8.2
  vdev_children_are_offline@Base 0.8.2
  vdev_clear@Base 0.8.2
+ vdev_clear_kobj_evt@Base 2.1.11
  vdev_clear_resilver_deferred@Base 0.8.3
  vdev_clear_stats@Base 0.8.2
  vdev_close@Base 0.8.2
@@ -2227,6 +2229,7 @@
  vdev_open@Base 0.8.2
  vdev_open_children@Base 0.8.2
  vdev_open_children_subset@Base 2.1
+ vdev_post_kobj_evt@Base 2.1.11
  vdev_probe@Base 0.8.2
  vdev_propagate_state@Base 0.8.2
  vdev_psize_to_asize@Base 0.8.2
@@ -2277,6 +2280,7 @@
  vdev_removal_max_span@Base 0.8.2
  vdev_remove_child@Base 0.8.2
  vdev_remove_parent@Base 0.8.2
+ vdev_remove_wanted@Base 2.1.11
  vdev_reopen@Base 0.8.2
  vdev_replace_in_progress@Base 2.0
  vdev_replacing_ops@Base 0.8.2
diff -Nru zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch
--- zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch      1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch      2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,91 @@
+From a68dfdb88c88fe970343e49b48bfd3bb4cef99d2 Mon Sep 17 00:00:00 2001
+From: Ameer Hamza <106930537+ixha...@users.noreply.github.com>
+Date: Wed, 19 Apr 2023 21:04:32 +0500
+Subject: [PATCH] Fix "Detach spare vdev in case if resilvering does not
+ happen"
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Spare vdev should detach from the pool when a disk is reinserted.
+However, spare detachment depends on the completion of resilvering,
+and if resilver does not schedule, the spare vdev keeps attached to
+the pool until the next resilvering. When a zfs pool contains
+several disks (25+ mirror), resilvering does not always happen when
+a disk is reinserted. In this patch, spare vdev is manually detached
+from the pool when resilvering does not occur and it has been tested
+on both Linux and FreeBSD.
+
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Reviewed-by: Alexander Motin <m...@freebsd.org>
+Signed-off-by: Ameer Hamza <aha...@ixsystems.com>
+Closes #14722
+---
+ include/sys/spa.h |  1 +
+ module/zfs/spa.c  |  5 +++--
+ module/zfs/vdev.c | 12 +++++++++++-
+ 3 files changed, 15 insertions(+), 3 deletions(-)
+
+diff --git a/include/sys/spa.h b/include/sys/spa.h
+index fedadab45..07e09d1ec 100644
+--- a/include/sys/spa.h
++++ b/include/sys/spa.h
+@@ -785,6 +785,7 @@ extern int bpobj_enqueue_free_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx);
+ #define       SPA_ASYNC_L2CACHE_REBUILD               0x800
+ #define       SPA_ASYNC_L2CACHE_TRIM                  0x1000
+ #define       SPA_ASYNC_REBUILD_DONE                  0x2000
++#define       SPA_ASYNC_DETACH_SPARE                  0x4000
+ 
+ /* device manipulation */
+ extern int spa_vdev_add(spa_t *spa, nvlist_t *nvroot);
+diff --git a/module/zfs/spa.c b/module/zfs/spa.c
+index 1ed79eed3..8bc51f777 100644
+--- a/module/zfs/spa.c
++++ b/module/zfs/spa.c
+@@ -6987,7 +6987,7 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
+  * Detach a device from a mirror or replacing vdev.
+  *
+  * If 'replace_done' is specified, only detach if the parent
+- * is a replacing vdev.
++ * is a replacing or a spare vdev.
+  */
+ int
+ spa_vdev_detach(spa_t *spa, uint64_t guid, uint64_t pguid, int replace_done)
+@@ -8210,7 +8210,8 @@ spa_async_thread(void *arg)
+        * If any devices are done replacing, detach them.
+        */
+       if (tasks & SPA_ASYNC_RESILVER_DONE ||
+-          tasks & SPA_ASYNC_REBUILD_DONE) {
++          tasks & SPA_ASYNC_REBUILD_DONE ||
++          tasks & SPA_ASYNC_DETACH_SPARE) {
+               spa_vdev_resilver_done(spa);
+       }
+ 
+diff --git a/module/zfs/vdev.c b/module/zfs/vdev.c
+index 4b9d7e7c0..ee0c1d862 100644
+--- a/module/zfs/vdev.c
++++ b/module/zfs/vdev.c
+@@ -4085,9 +4085,19 @@ vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, vdev_state_t *newstate)
+ 
+       if (wasoffline ||
+           (oldstate < VDEV_STATE_DEGRADED &&
+-          vd->vdev_state >= VDEV_STATE_DEGRADED))
++          vd->vdev_state >= VDEV_STATE_DEGRADED)) {
+               spa_event_notify(spa, vd, NULL, ESC_ZFS_VDEV_ONLINE);
+ 
++              /*
++               * Asynchronously detach spare vdev if resilver or
++               * rebuild is not required
++               */
++              if (vd->vdev_unspare &&
++                  !dsl_scan_resilvering(spa->spa_dsl_pool) &&
++                  !dsl_scan_resilver_scheduled(spa->spa_dsl_pool) &&
++                  !vdev_rebuild_active(tvd))
++                      spa_async_request(spa, SPA_ASYNC_DETACH_SPARE);
++      }
+       return (spa_vdev_state_exit(spa, vd, 0));
+ }
+ 
+-- 
+2.39.2
+
diff -Nru zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch
--- zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch      1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch      2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,65 @@
+From 671b1af1bc4b20ddd939c2ede22748bd027d30be Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Lu=C3=ADs=20Henriques?=
+ <73643340+lumi...@users.noreply.github.com>
+Date: Tue, 30 May 2023 23:15:24 +0100
+Subject: [PATCH] Fix NULL pointer dereference when doing concurrent 'send'
+ operations
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+A NULL pointer will occur when doing a 'zfs send -S' on a dataset that
+is still being received.  The problem is that the new 'send' will
+rightfully fail to own the datasets (i.e. dsl_dataset_own_force() will
+fail), but then dmu_send() will still do the dsl_dataset_disown().
+
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Signed-off-by: Luís Henriques <hen...@camandro.org>
+Closes #14903
+Closes #14890
+---
+ module/zfs/dmu_send.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dmu_send.c b/module/zfs/dmu_send.c
+index cd9ecc07f..0dd1ec210 100644
+--- a/module/zfs/dmu_send.c
++++ b/module/zfs/dmu_send.c
+@@ -2797,6 +2797,7 @@ dmu_send(const char *tosnap, const char *fromsnap, boolean_t embedok,
+                       }
+ 
+                       if (err == 0) {
++                              owned = B_TRUE;
+                               err = zap_lookup(dspp.dp->dp_meta_objset,
+                                   dspp.to_ds->ds_object,
+                                   DS_FIELD_RESUME_TOGUID, 8, 1,
+@@ -2810,21 +2811,24 @@ dmu_send(const char *tosnap, const char *fromsnap, boolean_t embedok,
+                                   sizeof (dspp.saved_toname),
+                                   dspp.saved_toname);
+                       }
+-                      if (err != 0)
++                      /* Only disown if there was an error in the lookups */
++                      if (owned && (err != 0))
+                               dsl_dataset_disown(dspp.to_ds, dsflags, FTAG);
+ 
+                       kmem_strfree(name);
+               } else {
+                       err = dsl_dataset_own(dspp.dp, tosnap, dsflags,
+                           FTAG, &dspp.to_ds);
++                      if (err == 0)
++                              owned = B_TRUE;
+               }
+-              owned = B_TRUE;
+       } else {
+               err = dsl_dataset_hold_flags(dspp.dp, tosnap, dsflags, FTAG,
+                   &dspp.to_ds);
+       }
+ 
+       if (err != 0) {
++              /* Note: dsl dataset is not owned at this point */
+               dsl_pool_rele(dspp.dp, FTAG);
+               return (err);
+       }
+-- 
+2.39.2
+
diff -Nru zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch
--- zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch   1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch   2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,45 @@
+From 93a99c6daae6e8c126ead2bcf331e5772c966cc7 Mon Sep 17 00:00:00 2001
+From: Rich Ercolani <214141+rincebr...@users.noreply.github.com>
+Date: Wed, 31 May 2023 19:58:41 -0400
+Subject: [PATCH] Revert "initramfs: use `mount.zfs` instead of `mount`"
+
+This broke mounting of snapshots on / for users.
+
+See https://github.com/openzfs/zfs/issues/9461#issuecomment-1376162949 for more context.
+
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Signed-off-by: Rich Ercolani <rincebr...@gmail.com>
+Closes #14908
+---
+ contrib/initramfs/scripts/zfs | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+--- a/contrib/initramfs/scripts/zfs
++++ b/contrib/initramfs/scripts/zfs
+@@ -342,7 +342,7 @@
+ 
+       # Need the _original_ datasets mountpoint!
+       mountpoint=$(get_fs_value "$fs" mountpoint)
+-      ZFS_CMD="mount.zfs -o zfsutil"
++      ZFS_CMD="mount -o zfsutil -t zfs"
+       if [ "$mountpoint" = "legacy" ] || [ "$mountpoint" = "none" ]; then
+               # Can't use the mountpoint property. Might be one of our
+               # clones. Check the 'org.zol:mountpoint' property set in
+@@ -359,7 +359,7 @@
+                       fi
+                       # Don't use mount.zfs -o zfsutils for legacy mountpoint
+                       if [ "$mountpoint" = "legacy" ]; then
+-                              ZFS_CMD="mount.zfs"
++                              ZFS_CMD="mount -t zfs"
+                       fi
+                       # Last hail-mary: Hope 'rootmnt' is set!
+                       mountpoint=""
+@@ -930,7 +930,7 @@
+               echo "       not specified on the kernel command line."
+               echo ""
+               echo "Manually mount the root filesystem on $rootmnt and then exit."
+-              echo "Hint: Try:  mount.zfs -o zfsutil ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
++              echo "Hint: Try:  mount -o zfsutil -t zfs ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
+               shell
+       fi
+ 
diff -Nru zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch
--- zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch    1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch    2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,31 @@
+From b01a8cc2c0fe6ee4af05bb0b1911afcbd39da64b Mon Sep 17 00:00:00 2001
+From: Alexander Motin <m...@freebsd.org>
+Date: Thu, 11 May 2023 17:27:12 -0400
+Subject: [PATCH] zil: Don't expect zio_shrink() to succeed.
+
+At least for RAIDZ zio_shrink() does not reduce zio size, but reduced
+wsz in that case likely results in writing uninitialized memory.
+
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Signed-off-by:  Alexander Motin <m...@freebsd.org>
+Sponsored by:   iXsystems, Inc.
+Closes #14853
+---
+ module/zfs/zil.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/module/zfs/zil.c b/module/zfs/zil.c
+index 0456d3801..cca061040 100644
+--- a/module/zfs/zil.c
++++ b/module/zfs/zil.c
+@@ -1593,6 +1593,7 @@ zil_lwb_write_issue(zilog_t *zilog, lwb_t *lwb)
+               wsz = P2ROUNDUP_TYPED(lwb->lwb_nused, ZIL_MIN_BLKSZ, uint64_t);
+               ASSERT3U(wsz, <=, lwb->lwb_sz);
+               zio_shrink(lwb->lwb_write_zio, wsz);
++              wsz = lwb->lwb_write_zio->io_size;
+ 
+       } else {
+               wsz = lwb->lwb_sz;
+-- 
+2.39.2
+
diff -Nru zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch
--- zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch      1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch      2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,48 @@
+From 837e426c1f302e580a18a213fd216322f480caf8 Mon Sep 17 00:00:00 2001
+From: Brian Behlendorf <behlendo...@llnl.gov>
+Date: Wed, 7 Jun 2023 10:43:43 -0700
+Subject: [PATCH] Linux: Never sleep in kmem_cache_alloc(..., KM_NOSLEEP)
+ (#14926)
+
+When a kmem cache is exhausted and needs to be expanded a new
+slab is allocated.  KM_SLEEP callers can block and wait for the
+allocation, but KM_NOSLEEP callers were incorrectly allowed to
+block as well.
+
+Resolve this by attempting an emergency allocation as a best
+effort.  This may fail but that's fine since any KM_NOSLEEP
+consumer is required to handle an allocation failure.
+
+Signed-off-by: Brian Behlendorf <behlendo...@llnl.gov>
+Reviewed-by: Adam Moss <c...@yotes.com>
+Reviewed-by: Brian Atkinson <batkin...@lanl.gov>
+Reviewed-by: Richard Yao <richard....@alumni.stonybrook.edu>
+Reviewed-by: Tony Hutter <hutt...@llnl.gov>
+---
+ module/os/linux/spl/spl-kmem-cache.c | 12 +++++++++++-
+ 1 file changed, 11 insertions(+), 1 deletion(-)
+
+--- a/module/os/linux/spl/spl-kmem-cache.c
++++ b/module/os/linux/spl/spl-kmem-cache.c
+@@ -1017,10 +1017,20 @@
+       ASSERT0(flags & ~KM_PUBLIC_MASK);
+       ASSERT(skc->skc_magic == SKC_MAGIC);
+       ASSERT((skc->skc_flags & KMC_SLAB) == 0);
+-      might_sleep();
++
+       *obj = NULL;
+ 
+       /*
++       * Since we can't sleep attempt an emergency allocation to satisfy
++       * the request.  The only alterative is to fail the allocation but
++       * it's preferable try.  The use of KM_NOSLEEP is expected to be rare.
++       */
++      if (flags & KM_NOSLEEP)
++              return (spl_emergency_alloc(skc, flags, obj));
++
++      might_sleep();
++
++      /*
+        * Before allocating a new slab wait for any reaping to complete and
+        * then return so the local magazine can be rechecked for new objects.
+        */
diff -Nru zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
--- zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch      1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch      2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,93 @@
+From 77b0c6f0403b2b7d145bf6c244b6acbc757ccdc9 Mon Sep 17 00:00:00 2001
+From: Rob N <r...@despairlabs.com>
+Date: Wed, 29 Nov 2023 04:16:49 +1100
+Subject: [PATCH] dnode_is_dirty: check dnode and its data for dirtiness
+
+Over its history this the dirty dnode test has been changed between
+checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
+`dn_dirty_record`.
+
+  de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
+  2531ce372 Revert "Report holes when there are only metadata changes"
+  ec4f9b8f3 Report holes when there are only metadata changes
+  454365bba Fix dirty check in dmu_offset_next()
+  66aca2473 SEEK_HOLE should not block on txg_wait_synced()
+
+Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9
+
+It turns out both are actually required.
+
+In the case of appending data to a newly created file, the dnode proper
+is dirtied (at least to change the blocksize) and dirty records are
+added.  Thus, a single logical operation is represented by separate
+dirty indicators, and must not be separated.
+
+The incorrect dirty check becomes a problem when the first block of a
+file is being appended to while another process is calling lseek to skip
+holes. There is a small window where the dnode part is undirtied while
+there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
+would not know that the file is dirty, and would go to
+`dnode_next_offset()`. Since the object has no data blocks yet, it
+returns `ESRCH`, indicating no data found, which results in `ENXIO`
+being returned to `lseek()`'s caller.
+
+Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
+uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
+replicate the holes in the target. When it hits the bug, its initial
+search for data fails, and it goes on to call `fallocate()` to create a
+hole over the entire destination file.
+
+This has come up more recently as users upgrade their systems, getting
+OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
+reproduced against 2.1, as well as on FreeBSD 13 and 14.
+
+This change simply updates the dirty check to check both types of dirty.
+If there's anything dirty at all, we immediately go to the "wait for
+sync" stage, It doesn't really matter after that; both changes are on
+disk, so the dirty fields should be correct.
+
+Sponsored-by: Klara, Inc.
+Sponsored-by: Wasabi Technology, Inc.
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Reviewed-by: Alexander Motin <m...@freebsd.org>
+Reviewed-by: Rich Ercolani <rincebr...@gmail.com>
+Signed-off-by: Rob Norris <rob.nor...@klarasystems.com>
+Closes #15571
+Closes #15526
+---
+ module/zfs/dnode.c | 12 ++++++++++--
+ 1 file changed, 10 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
+index a9aaa4d21..efebc443a 100644
+--- a/module/zfs/dnode.c
++++ b/module/zfs/dnode.c
+@@ -1773,7 +1773,14 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
+ }
+ 
+ /*
+- * Checks if the dnode contains any uncommitted dirty records.
++ * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
++ * It is important to check both conditions, as some operations (eg appending
++ * to a file) can dirty both as a single logical unit, but they are not synced
++ * out atomically, so checking one and not the other can result in an object
++ * appearing to be clean mid-way through a commit.
++ *
++ * Do not change this lightly! If you get it wrong, dmu_offset_next() can
++ * detect a hole where there is really data, leading to silent corruption.
+  */
+ boolean_t
+ dnode_is_dirty(dnode_t *dn)
+@@ -1781,7 +1788,8 @@ dnode_is_dirty(dnode_t *dn)
+       mutex_enter(&dn->dn_mtx);
+ 
+       for (int i = 0; i < TXG_SIZE; i++) {
+-              if (multilist_link_active(&dn->dn_dirty_link[i])) {
++              if (multilist_link_active(&dn->dn_dirty_link[i]) ||
++                  !list_is_empty(&dn->dn_dirty_records[i])) {
+                       mutex_exit(&dn->dn_mtx);
+                       return (B_TRUE);
+               }
+-- 
+2.39.2
+
diff -Nru zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch
--- zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch      1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch      2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,104 @@
+From 1ca531971f176ae7b8ca440e836985ae1d7fa0ec Mon Sep 17 00:00:00 2001
+From: Jason King <jasonbk...@users.noreply.github.com>
+Date: Thu, 12 Oct 2023 13:01:54 -0500
+Subject: [PATCH] Zpool can start allocating from metaslab before TRIMs have
+ completed
+
+When doing a manual TRIM on a zpool, the metaslab being TRIMmed is
+potentially re-enabled before all queued TRIM zios for that metaslab
+have completed. Since TRIM zios have the lowest priority, it is
+possible to get into a situation where allocations occur from the
+just re-enabled metaslab and cut ahead of queued TRIMs to the same
+metaslab.  If the ranges overlap, this will cause corruption.
+
+We were able to trigger this pretty consistently with a small single
+top-level vdev zpool (i.e. small number of metaslabs) with heavy
+parallel write activity while performing a manual TRIM against a
+somewhat 'slow' device (so TRIMs took a bit of time to complete).
+With the patch, we've not been able to recreate it since. It was on
+illumos, but inspection of the OpenZFS trim code looks like the
+relevant pieces are largely unchanged and so it appears it would be
+vulnerable to the same issue.
+
+Reviewed-by: Igor Kozhukhov <i...@dilos.org>
+Reviewed-by: Alexander Motin <m...@freebsd.org>
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Signed-off-by: Jason King <jk...@racktopsystems.com>
+Illumos-issue: https://www.illumos.org/issues/15939
+Closes #15395
+---
+ module/zfs/vdev_trim.c | 28 +++++++++++++++++++---------
+ 1 file changed, 19 insertions(+), 9 deletions(-)
+
+diff --git a/module/zfs/vdev_trim.c b/module/zfs/vdev_trim.c
+index 92daed48f..c0ce2ac28 100644
+--- a/module/zfs/vdev_trim.c
++++ b/module/zfs/vdev_trim.c
+@@ -23,6 +23,7 @@
+  * Copyright (c) 2016 by Delphix. All rights reserved.
+  * Copyright (c) 2019 by Lawrence Livermore National Security, LLC.
+  * Copyright (c) 2021 Hewlett Packard Enterprise Development LP
++ * Copyright 2023 RackTop Systems, Inc.
+  */
+ 
+ #include <sys/spa.h>
+@@ -572,6 +573,7 @@ vdev_trim_ranges(trim_args_t *ta)
+       uint64_t extent_bytes_max = ta->trim_extent_bytes_max;
+       uint64_t extent_bytes_min = ta->trim_extent_bytes_min;
+       spa_t *spa = vd->vdev_spa;
++      int error = 0;
+ 
+       ta->trim_start_time = gethrtime();
+       ta->trim_bytes_done = 0;
+@@ -591,19 +593,32 @@ vdev_trim_ranges(trim_args_t *ta)
+               uint64_t writes_required = ((size - 1) / extent_bytes_max) + 1;
+ 
+               for (uint64_t w = 0; w < writes_required; w++) {
+-                      int error;
+-
+                       error = vdev_trim_range(ta, VDEV_LABEL_START_SIZE +
+                           rs_get_start(rs, ta->trim_tree) +
+                           (w *extent_bytes_max), MIN(size -
+                           (w * extent_bytes_max), extent_bytes_max));
+                       if (error != 0) {
+-                              return (error);
++                              goto done;
+                       }
+               }
+       }
+ 
+-      return (0);
++done:
++      /*
++       * Make sure all TRIMs for this metaslab have completed before
++       * returning. TRIM zios have lower priority over regular or syncing
++       * zios, so all TRIM zios for this metaslab must complete before the
++       * metaslab is re-enabled. Otherwise it's possible write zios to
++       * this metaslab could cut ahead of still queued TRIM zios for this
++       * metaslab causing corruption if the ranges overlap.
++       */
++      mutex_enter(&vd->vdev_trim_io_lock);
++      while (vd->vdev_trim_inflight[0] > 0) {
++              cv_wait(&vd->vdev_trim_io_cv, &vd->vdev_trim_io_lock);
++      }
++      mutex_exit(&vd->vdev_trim_io_lock);
++
++      return (error);
+ }
+ 
+ static void
+@@ -922,11 +937,6 @@ vdev_trim_thread(void *arg)
+       }
+ 
+       spa_config_exit(spa, SCL_CONFIG, FTAG);
+-      mutex_enter(&vd->vdev_trim_io_lock);
+-      while (vd->vdev_trim_inflight[0] > 0) {
+-              cv_wait(&vd->vdev_trim_io_cv, &vd->vdev_trim_io_lock);
+-      }
+-      mutex_exit(&vd->vdev_trim_io_lock);
+ 
+       range_tree_destroy(ta.trim_tree);
+ 
+-- 
+2.39.2
+
diff -Nru zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch
--- zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch       1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch       2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,94 @@
+From 6cb5e1e7591da20af3a15793e022345a73e40fb7 Mon Sep 17 00:00:00 2001
+From: felixdoerre <felixdoe...@users.noreply.github.com>
+Date: Wed, 20 Oct 2021 19:40:00 +0200
+Subject: [PATCH] libshare: nfs: pass through ipv6 addresses in bracket
+ notation
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Recognize when the host part of a sharenfs attribute is an ipv6
+Literal and pass that through without modification.
+
+Reviewed-by: Brian Behlendorf <behlendo...@llnl.gov>
+Signed-off-by: Felix Dörre <fe...@dogcraft.de>
+Closes: #11171
+Closes #11939
+Closes: #1894
+---
+--- a/lib/libshare/os/linux/nfs.c
++++ b/lib/libshare/os/linux/nfs.c
+@@ -180,8 +180,9 @@
+ {
+       int error;
+       const char *access;
+-      char *host_dup, *host, *next;
++      char *host_dup, *host, *next, *v6Literal;
+       nfs_host_cookie_t *udata = (nfs_host_cookie_t *)pcookie;
++      int cidr_len;
+ 
+ #ifdef DEBUG
+       fprintf(stderr, "foreach_nfs_host_cb: key=%s, value=%s\n", opt, value);
+@@ -204,10 +205,46 @@
+               host = host_dup;
+ 
+               do {
+-                      next = strchr(host, ':');
+-                      if (next != NULL) {
+-                              *next = '\0';
+-                              next++;
++                      if (*host == '[') {
++                              host++;
++                              v6Literal = strchr(host, ']');
++                              if (v6Literal == NULL) {
++                                      free(host_dup);
++                                      return (SA_SYNTAX_ERR);
++                              }
++                              if (v6Literal[1] == '\0') {
++                                      *v6Literal = '\0';
++                                      next = NULL;
++                              } else if (v6Literal[1] == '/') {
++                                      next = strchr(v6Literal + 2, ':');
++                                      if (next == NULL) {
++                                              cidr_len =
++                                                  strlen(v6Literal + 1);
++                                              memmove(v6Literal,
++                                                  v6Literal + 1,
++                                                  cidr_len);
++                                              v6Literal[cidr_len] = '\0';
++                                      } else {
++                                              cidr_len = next - v6Literal - 1;
++                                              memmove(v6Literal,
++                                                  v6Literal + 1,
++                                                  cidr_len);
++                                              v6Literal[cidr_len] = '\0';
++                                              next++;
++                                      }
++                              } else if (v6Literal[1] == ':') {
++                                      *v6Literal = '\0';
++                                      next = v6Literal + 2;
++                              } else {
++                                      free(host_dup);
++                                      return (SA_SYNTAX_ERR);
++                              }
++                      } else {
++                              next = strchr(host, ':');
++                              if (next != NULL) {
++                                      *next = '\0';
++                                      next++;
++                              }
+                       }
+ 
+                       error = udata->callback(udata->filename,
+--- a/man/man8/zfs.8
++++ b/man/man8/zfs.8
+@@ -545,7 +545,7 @@
+ on the
+ .Ar tank/home
+ file system:
+-.Dl # Nm zfs Cm set Sy sharenfs Ns = Ns ' Ns Ar rw Ns =@123.123.0.0/16,root= Ns Ar neo Ns ' tank/home
++.Dl # Nm zfs Cm set Sy sharenfs Ns = Ns ' Ns Ar rw Ns =@123.123.0.0/16:[::1],root= Ns Ar neo Ns ' tank/home
+ .Pp
+ If you are using DNS for host name resolution,
+ specify the fully-qualified hostname.
+
diff -Nru zfs-linux-2.1.11/debian/patches/series zfs-linux-2.1.11/debian/patches/series
--- zfs-linux-2.1.11/debian/patches/series      2023-04-19 13:37:42.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/series      2024-11-02 15:34:23.000000000 +0800
@@ -27,3 +27,12 @@
 0006-rootdelay-on-zfs-should-be-adaptive.patch
 0009-zdb-zero-pad-checksum-output.patch
 0010-zdb-zero-pad-checksum-output-follow-up.patch
+# 2.1.11+deb12u1
+0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch
+0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch
+0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch
+0022-zil-Don-t-expect-zio_shrink-to-succeed.patch
+0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch
+0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
+0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch
+0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch
