On Tue, 4 Apr 2023 17:30:25 +0200
Martin Matuska <[email protected]> wrote:
> So I am now a little bit confused - what is the consensus? :-)
My exmh email client made a mess of that. Let's try this again.
Rick has posted a patch. Your patch should also be incorporated to work
around other EXDEV errors, but a few lines earlier so it is protected by
the lock.
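
As a rough illustration of the fallback being discussed, here is a user-space
sketch, not the kernel code. fast_clone(), generic_copy() and copy_range() are
hypothetical stand-ins for zfs_clone_range(), vn_generic_copy_file_range() and
the VOP; only the "retry with the generic copy on EXDEV" shape is taken from
the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for zfs_clone_range(): refuses with EXDEV when
 * cloning is not available, otherwise "clones" by copying the bytes.
 */
static int
fast_clone(int cloning_enabled, const char *src, char *dst, size_t len)
{
	if (!cloning_enabled)
		return (EXDEV);
	memcpy(dst, src, len);
	return (0);
}

/* Hypothetical stand-in for vn_generic_copy_file_range(): plain copy. */
static int
generic_copy(const char *src, char *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

/*
 * Try the fast path first.  Only EXDEV triggers the fallback; any other
 * error is returned to the caller unchanged.
 */
static int
copy_range(int cloning_enabled, const char *src, char *dst, size_t len,
    int *used_fallback)
{
	int error;

	assert(src != NULL && dst != NULL && used_fallback != NULL);
	*used_fallback = 0;
	error = fast_clone(cloning_enabled, src, dst, len);
	if (error == EXDEV) {
		*used_fallback = 1;
		error = generic_copy(src, dst, len);
	}
	return (error);
}
```

In the real VOP the open question is where this check sits relative to the
vnode locks, which the sketch does not model.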
There were a couple of typos in Rick's patch (a missing keystroke;
s/ojset/objset/).
A combination of all three patches (Rick's null pointer dereference fix,
Rick's copy file range patch, plus your copy file range patch) is attached.
It builds fine on amd64 and i386 but is only compile tested so far. I am
installing it now and will test it once the install completes.
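
For readers following the locking loop in the attached patch, the pattern is
the usual "lock one, try the other, back off on failure" sequence. Below is a
minimal user-space sketch with pthread mutexes. lock_pair() and unlock_pair()
are hypothetical names, and plain mutexes do not model the shared versus
exclusive vnode locks or vn_start_write().

```c
#include <assert.h>
#include <pthread.h>

/*
 * Lock "out", then try "in" without blocking.  If the try fails, drop
 * "out", wait until "in" can be taken, release it again and start over.
 * The thread never sleeps on one lock while holding the other.
 */
static void
lock_pair(pthread_mutex_t *out, pthread_mutex_t *in)
{
	assert(out != NULL && in != NULL);
	for (;;) {
		pthread_mutex_lock(out);
		if (in == out)
			return;		/* same object, one lock suffices */
		if (pthread_mutex_trylock(in) == 0)
			return;		/* got both without blocking */
		pthread_mutex_unlock(out);
		pthread_mutex_lock(in);	/* wait for the contended lock */
		pthread_mutex_unlock(in);
	}
}

/* Release what lock_pair() acquired. */
static void
unlock_pair(pthread_mutex_t *out, pthread_mutex_t *in)
{
	if (in != out)
		pthread_mutex_unlock(in);
	pthread_mutex_unlock(out);
}
```

The in == out case matters because copy_file_range() may be called with the
same file as source and target.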
--
Cheers,
Cy Schubert <[email protected]>
FreeBSD UNIX: <[email protected]> Web: https://FreeBSD.org
NTP: <[email protected]> Web: https://nwtime.org
e^(i*pi)+1=0
>
> On 4. 4. 2023 17:26, Rick Macklem wrote:
> > On Tue, Apr 4, 2023 at 7:38 AM Mateusz Guzik <[email protected]> wrote:
> >>
> >> On 4/4/23, Cy Schubert <[email protected]> wrote:
> >>> In message <[email protected]>, Martin Matuska writes:
> >>>> The branch main has been updated by mm:
> >>>>
> >>>> URL:
> >>>> https://cgit.FreeBSD.org/src/commit/?id=8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>>
> >>>> commit 8ee579abe09ec1fe15c588fc9a08370b83b81cd6
> >>>> Author: Martin Matuska <[email protected]>
> >>>> AuthorDate: 2023-04-04 11:40:41 +0000
> >>>> Commit: Martin Matuska <[email protected]>
> >>>> CommitDate: 2023-04-04 11:43:34 +0000
> >>>>
> >>>> zfs: fall back if block_cloning feature is disabled
> >>>>
> >>>> If block_cloning is disabled, or other errors from zfs_clone_range()
> >>>> return an EXDEV we should fall back to vn_generic_copy_file_range().
> >>>>
> >>>> This fixes issues when copying files on the same dataset with
> >>>> block_cloning disabled.
> >>>>
> >>>> Upstreamed as pull request to OpenZFS.
> >>>>
> >>>> Reviewed by: Mateusz Guzik <[email protected]>
> >>>> OpenZFS pull request: 14713
> >>>> ---
> >>>> .../openzfs/module/os/freebsd/zfs/zfs_vnops_os.c | 17 ++++++++++-------
> >>>> 1 file changed, 10 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> index 97429b360a36..2cd1d27e37bc 100644
> >>>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>>> @@ -6243,13 +6243,6 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>>> int error;
> >>>> uint64_t len = *ap->a_lenp;
> >>>>
> >>>> - /*
> >>>> - * TODO: If offset/length is not aligned to recordsize, use
> >>>> - * vn_generic_copy_file_range() on this fragment.
> >>>> - * It would be better to do this after we lock the vnodes, but then we
> >>>> - * need something else than vn_generic_copy_file_range().
> >>>> - */
> >>>> -
> >>>> /* Lock both vnodes, avoiding risk of deadlock. */
> >>>> do {
> >>>> mp = NULL;
> >>>> @@ -6300,6 +6293,16 @@ unlock:
> >>>> if (mp != NULL)
> >>>> vn_finished_write(mp);
> >>>>
> >>>> + /*
> >>>> + * Fall back if block_cloning feature is disabled
> >>>> + * or other EXDEV failures from zfs_vnops.c
> >>>> + */
> >>>> + if (error == EXDEV) {
> >>>> + error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>>> + ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>>> + ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>>> + }
> >>>> +
> >>>> return (error);
> >>>> }
> >>>>
> >>>>
> >>> This is too late to fall back. On Rick's suggestion the following makes
> >>> the determination at zfs_freebsd_copy_file_range() entry, much earlier.
> >>>
> >> It's not too late, but I agree it is faster to bail out early.
> >>
> >> The proposed patch adds a condition which *differs* from the one in
> >> zfs_clone_range:
> >> if (dmu_objset_spa(inos) != dmu_objset_spa(outos)) {
> >> zfs_exit_two(inzfsvfs, outzfsvfs, FTAG);
> >> return (SET_ERROR(EXDEV));
> >> }
> >>
> >> ... meaning with the proposed patch the routine can still fail with
> >> EXDEV, making zfs_freebsd_copy_file_range also do it, which must not
> >> happen.
> > Since VOP_COPY_FILE_RANGE() is only called when invp and outvp
> > are on the same mount point, I don't think this can happen now.
> > However, there is a TODO comment that suggests a call with invp and
> > outvp on different mount points may be in the future.
> >
> > As such, leaving Martin's patch in so that it calls
> > vn_generic_copy_file_range()
> > when zfs_clone_range() returns EXDEV seems like a good idea to me.
> >
> >> That aside the code looks rather suspicious for the case where target
> >> and source vnode are the same. iow more work is needed here.
> > Definitely needs to be tested. I'll do that later to-day.
> >
> > rick
> >
> >> As the vnode is unlocked, you *can't* safely access zfsvfs_t
> >> *outzfsvfs = ZTOZSB(outzp); in that spot in this manner -- a forced
> >> unmount at the same time can free it.
> >>
> >> iow this patch does *NOT* work.
> >>
> >> With the committed variant the situation is damage controlled enough
> >> that there is time to sort it out correctly.
> >>
> >>> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> index d41821ff67f1..e18dcca58192 100644
> >>> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> >>> @@ -6243,6 +6243,18 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> >>> int error;
> >>> uint64_t len = *ap->a_lenp;
> >>>
> >>> + znode_t *outzp = VTOZ(ap->a_outvp);
> >>> + zfsvfs_t *outzfsvfs = ZTOZSB(outzp);
> >>> + objset_t *outos = outzfsvfs->z_os;
> >>> +
> >>> + if (!spa_feature_is_enabled(dmu_objset_spa(outos),
> >>> + SPA_FEATURE_BLOCK_CLONING)) {
> >>> + error = vn_generic_copy_file_range(ap->a_invp, ap->a_inoffp,
> >>> + ap->a_outvp, ap->a_outoffp, ap->a_lenp, ap->a_flags,
> >>> + ap->a_incred, ap->a_outcred, ap->a_fsizetd);
> >>> + return (error);
> >>> + }
> >>> +
> >>> /*
> >>> * TODO: If offset/length is not aligned to recordsize, use
> >>> * vn_generic_copy_file_range() on this fragment.
> >>>
> >>>
> >>> Can you revert your commit and commit this, please.
> >>>
> >>>
> >>> --
> >>> Cheers,
> >>> Cy Schubert <[email protected]>
> >>> FreeBSD UNIX: <[email protected]> Web: https://FreeBSD.org
> >>> NTP: <[email protected]> Web: https://nwtime.org
> >>>
> >>> e^(i*pi)+1=0
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Mateusz Guzik <mjguzik gmail.com>
diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
index 97429b360a36..16e0176be2ff 100644
--- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
+++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
@@ -6242,6 +6242,30 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 	struct uio io;
 	int error;
 	uint64_t len = *ap->a_lenp;
+	zfsvfs_t *outzfsvfs;
+	objset_t *outos;
+	bool done_outvp;
+
+	mp = NULL;
+	error = vn_start_write(outvp, &mp, V_WAIT);
+	if (error == 0)
+		error = vn_lock(outvp, LK_EXCLUSIVE);
+	done_outvp = true;
+	if (error == 0) {
+		outzfsvfs = ZTOZSB(VTOZ(outvp));
+		outos = outzfsvfs->z_os;
+		if (!spa_feature_is_enabled(dmu_objset_spa(outos),
+		    SPA_FEATURE_BLOCK_CLONING)) {
+			VOP_UNLOCK(outvp);
+			if (mp != NULL)
+				vn_finished_write(mp);
+			error = vn_generic_copy_file_range(ap->a_invp,
+			    ap->a_inoffp, ap->a_outvp, ap->a_outoffp,
+			    ap->a_lenp, ap->a_flags, ap->a_incred,
+			    ap->a_outcred, ap->a_fsizetd);
+			return (error);
+		}
+	}
 	/*
 	 * TODO: If offset/length is not aligned to recordsize, use
@@ -6252,27 +6276,29 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 	/* Lock both vnodes, avoiding risk of deadlock. */
 	do {
-		mp = NULL;
-		error = vn_start_write(outvp, &mp, V_WAIT);
+		if (!done_outvp) {
+			mp = NULL;
+			error = vn_start_write(outvp, &mp, V_WAIT);
+			if (error == 0)
+				error = vn_lock(outvp, LK_EXCLUSIVE);
+		}
 		if (error == 0) {
-			error = vn_lock(outvp, LK_EXCLUSIVE);
-			if (error == 0) {
-				if (invp == outvp)
-					break;
-				error = vn_lock(invp, LK_SHARED | LK_NOWAIT);
-				if (error == 0)
-					break;
-				VOP_UNLOCK(outvp);
-				if (mp != NULL)
-					vn_finished_write(mp);
-				mp = NULL;
-				error = vn_lock(invp, LK_SHARED);
-				if (error == 0)
-					VOP_UNLOCK(invp);
-			}
+			if (invp == outvp)
+				break;
+			error = vn_lock(invp, LK_SHARED | LK_NOWAIT);
+			if (error == 0)
+				break;
+			VOP_UNLOCK(outvp);
+			if (mp != NULL)
+				vn_finished_write(mp);
+			mp = NULL;
+			error = vn_lock(invp, LK_SHARED);
+			if (error == 0)
+				VOP_UNLOCK(invp);
 		}
 		if (mp != NULL)
 			vn_finished_write(mp);
+		done_outvp = false;
 	} while (error == 0);
 	if (error != 0)
 		return (error);
@@ -6290,7 +6316,12 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
 		goto unlock;
 	error = zfs_clone_range(VTOZ(invp), ap->a_inoffp, VTOZ(outvp),
-	    ap->a_outoffp, &len, ap->a_fsizetd->td_ucred);
+	    ap->a_outoffp, &len, ap->a_outcred);
+	if (error == EXDEV)
+		error = vn_generic_copy_file_range(ap->a_invp,
+		    ap->a_inoffp, ap->a_outvp, ap->a_outoffp,
+		    ap->a_lenp, ap->a_flags, ap->a_incred,
+		    ap->a_outcred, ap->a_fsizetd);
 	*ap->a_lenp = (size_t)len;
 unlock: