Thanks, Mark.
On 07.09.2023 15:40, Mark Millard wrote:
On Sep 7, 2023, at 11:48, Glen Barber <g...@freebsd.org> wrote:
On Thu, Sep 07, 2023 at 11:17:22AM -0700, Mark Millard wrote:
When I next have time, should I retry based on a more recent
vintage of main that includes 969071be938c ?
Yes, please, if you can.
As stands, I rebooted that machine into my normal
enviroment, so the after-crash-with-dump-info
context is preserved. I'll presume lack of a need
to preserve that context unless I hear otherwise.
(But I'll work on this until later today.)
Even my normal environment predates the commit in
question by a few commits. So I'll end up doing a
more general round of updates overall.
Someone can let me know if there is a preference
for debug over non-debug for the next test run.
It is not unknown when some bugs disappear once debugging is enabled due
to different execution timings, but generally debug may to detect the
problem closer to its origin instead of looking on random consequences.
I am only starting to look on this report (unless Pawel or somebody beat
me on it), and don't have additional requests yet, but if you can repeat
the same with debug kernel (in-base ZFS's ZFS_DEBUG setting follows
kernel's INVARIANTS), it may give us some additional information.
Looking at "git: 969071be938c - main", the relevant
part seems to be just (white space possibly not
preserved accurately):
diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index 9fb5aee6a023..4e4161ef1a7f 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -3076,12 +3076,14 @@ vn_copy_file_range(struct vnode *invp, off_t *inoffp,
struct vnode *outvp,
goto out;
/*
- * If the two vnode are for the same file system, call
+ * If the two vnodes are for the same file system type, call
* VOP_COPY_FILE_RANGE(), otherwise call vn_generic_copy_file_range()
- * which can handle copies across multiple file systems.
+ * which can handle copies across multiple file system types.
*/
*lenp = len;
- if (invp->v_mount == outvp->v_mount)
+ if (invp->v_mount == outvp->v_mount ||
+ strcmp(invp->v_mount->mnt_vfc->vfc_name,
+ outvp->v_mount->mnt_vfc->vfc_name) == 0)
error = VOP_COPY_FILE_RANGE(invp, inoffp, outvp, outoffp,
lenp, flags, incred, outcred, fsize_td);
else
That looks to call VOP_COPY_FILE_RANGE in more contexts and
vn_generic_copy_file_range in fewer.
The backtrace I reported involves: VOP_COPY_FILE_RANGE
So it appears this change is unlikely to invalidate my
test result, although failure might happen sooner if
more VOP_COPY_FILE_RANGE calls happen with the newer code.
Your logic is likely right, but if you have block cloning requests both
within and between datasets, this patch may change the pattern. Though
it is obviously not a fix for the issue. I responded to the commit
email only because it makes no difference while vfs.zfs.bclone_enabled is 0.
That in turns means that someone may come up with some
other change for me to test by the time I get around to
setting up another test. Let me know if so.
--
Alexander Motin