> -----Original Message----- > From: Amir Goldstein <amir7...@gmail.com> > Sent: Friday, May 30, 2025 9:44 PM > To: wangtao <tao.wang...@honor.com> > Cc: sumit.sem...@linaro.org; christian.koe...@amd.com; > kra...@redhat.com; vivek.kasire...@intel.com; v...@zeniv.linux.org.uk; > brau...@kernel.org; hu...@google.com; a...@linux-foundation.org; > benjamin.gaign...@collabora.com; brian.star...@arm.com; > jstu...@google.com; tjmerc...@google.com; j...@suse.cz; > baolin.w...@linux.alibaba.com; linux-me...@vger.kernel.org; dri- > de...@lists.freedesktop.org; linaro-mm-...@lists.linaro.org; linux- > ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux- > m...@kvack.org; wangbintian(BintianWang) <bintian.w...@honor.com>; > yipengxiang <yipengxi...@honor.com>; liulu 00013167 > <liulu....@honor.com>; hanfeng 00012985 <feng....@honor.com> > Subject: Re: [PATCH v3 1/4] fs: allow cross-FS copy_file_range for memory- > backed files > > On Fri, May 30, 2025 at 12:40 PM wangtao <tao.wang...@honor.com> wrote: > > > > Memory-backed files can optimize copy performance via copy_file_range > > callbacks. Compared to mmap&read: reduces GUP (get_user_pages) > > overhead; vs sendfile/splice: eliminates one memory copy; supports > > dmabuf zero-copy implementation. > > > > Signed-off-by: wangtao <tao.wang...@honor.com> > > --- > > Hi wangtao, > > Let me rephrase my reaction gently: > Regardless of the proposed API extension, and referring to the > implementation itself - I wrote the current code and I can barely understand > the logic every time I come back to read it. > > Your changes make it too messy to be reviewed/maintained. > I have a few suggestions for simplifications (see below). > > The complication arise from the fact that normally the destination fops are > used to call the copy_range() method and this case is a deviation from this > standard, we need to make the cope simpler using a helper to choose the > fops to use. > > > fs/read_write.c | 71 +++++++++++++++++++++++++++++++++----------- > -- > > include/linux/fs.h | 2 ++ > > 2 files changed, 54 insertions(+), 19 deletions(-) > > > > diff --git a/fs/read_write.c b/fs/read_write.c index > > bb0ed26a0b3a..591c6db7b785 100644 > > --- a/fs/read_write.c > > +++ b/fs/read_write.c > > @@ -1469,6 +1469,20 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, > out_fd, > > int, in_fd, } #endif > > > > +static inline bool is_copy_memory_file_to_file(struct file *file_in, > > + struct file *file_out) { > > + return (file_in->f_op->fop_flags & FOP_MEMORY_FILE) && > > + file_in->f_op->copy_file_range && > > +file_out->f_op->write_iter; } > > + > > +static inline bool is_copy_file_to_memory_file(struct file *file_in, > > + struct file *file_out) { > > + return (file_out->f_op->fop_flags & FOP_MEMORY_FILE) && > > + file_in->f_op->read_iter && > > +file_out->f_op->copy_file_range; } > > + > > Please combine that to a helper: > const struct file_operations *memory_copy_file_ops(struct file *file_in, > struct file *file_out) which returns either file_in->f_op, file_out->f_op or > NULL. > Great suggestions! I'll update them in the new patch version soon.
> > > /* > > * Performs necessary checks before doing a file copy > > * > > @@ -1484,11 +1498,23 @@ static int generic_copy_file_checks(struct file > *file_in, loff_t pos_in, > > struct inode *inode_out = file_inode(file_out); > > uint64_t count = *req_count; > > loff_t size_in; > > + bool splice = flags & COPY_FILE_SPLICE; > > + bool has_memory_file; > > int ret; > > > > - ret = generic_file_rw_checks(file_in, file_out); > > - if (ret) > > - return ret; > > + /* Skip generic checks, allow cross-sb copies for dma-buf/tmpfs */ > > + has_memory_file = is_copy_memory_file_to_file(file_in, file_out) || > > + is_copy_file_to_memory_file(file_in, file_out); > > + if (!splice && has_memory_file) { > > + if (!(file_in->f_mode & FMODE_READ) || > > + !(file_out->f_mode & FMODE_WRITE) || > > + (file_out->f_flags & O_APPEND)) > > + return -EBADF; > > I don't like that this both duplicates code and does not check all the checks > in > generic_file_rw_checks() for the non-memory file side. > I do not currently have a suggestion how to make this less messy, more > human readable and correct. Since dmabuf files aren't regular files, we'll skip generic_file_rw_checks. Adding an essential_file_rw_checks() could reduce code duplication. > > > + } else { > > + ret = generic_file_rw_checks(file_in, file_out); > > + if (ret) > > + return ret; > > + } > > > > /* > > * We allow some filesystems to handle cross sb copy, but > > passing @@ -1500,7 +1526,7 @@ static int generic_copy_file_checks(struct > file *file_in, loff_t pos_in, > > * and several different sets of file_operations, but they all end > > up > > * using the same ->copy_file_range() function pointer. > > */ > > - if (flags & COPY_FILE_SPLICE) { > > + if (splice || has_memory_file) { > > /* cross sb splice is allowed */ > > This comment is not accurate - it should say "cross fs-type", because it skips > the test that compares the ->copy_file_range pointers, not the sb. > If you are making changes to this function, this should be clarified. > Will fix this shortly. > > } else if (file_out->f_op->copy_file_range) { > > if (file_in->f_op->copy_file_range != @@ -1581,23 > > +1607,30 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, > > * same sb using clone, but for filesystems where both clone and > > copy > > * are supported (e.g. nfs,cifs), we only call the copy method. > > */ > > - if (!splice && file_out->f_op->copy_file_range) { > > - ret = file_out->f_op->copy_file_range(file_in, pos_in, > > - file_out, pos_out, > > - len, flags); > > - } else if (!splice && file_in->f_op->remap_file_range && samesb) { > > - ret = file_in->f_op->remap_file_range(file_in, pos_in, > > - file_out, pos_out, > > - min_t(loff_t, MAX_RW_COUNT, len), > > - REMAP_FILE_CAN_SHORTEN); > > - /* fallback to splice */ > > - if (ret <= 0) > > + if (!splice) { > > + if (is_copy_memory_file_to_file(file_in, file_out)) { > > + ret = file_in->f_op->copy_file_range(file_in, > > pos_in, > > + file_out, pos_out, len, flags); > > + } else if (is_copy_file_to_memory_file(file_in, file_out)) { > > + ret = file_out->f_op->copy_file_range(file_in, > > pos_in, > > + file_out, pos_out, len, flags); > > + } else if (file_out->f_op->copy_file_range) { > > + ret = file_out->f_op->copy_file_range(file_in, > > pos_in, > > + file_out, pos_out, > > + len, flags); > > + } else if (file_in->f_op->remap_file_range && samesb) { > > + ret = file_in->f_op->remap_file_range(file_in, > > pos_in, > > + file_out, pos_out, > > + min_t(loff_t, MAX_RW_COUNT, len), > > + REMAP_FILE_CAN_SHORTEN); > > + /* fallback to splice */ > > + if (ret <= 0) > > + splice = true; > > + } else if (samesb) { > > + /* Fallback to splice for same sb copy for > > + backward compat */ > > splice = true; > > - } else if (samesb) { > > - /* Fallback to splice for same sb copy for backward compat > > */ > > - splice = true; > > + } > > This is the way-too-ugly-to-live part. > Was pretty bad before and now it is unacceptable. > way too much unneeded nesting and too hard to follow. > > First of all, please use splice label and call goto splice (before the > comment) > instead of adding another nesting level. > I would do that even if not adding memory fd copy support. > > Second, please store result of mem_fops = memory_copy_file_ops() and > start with: > + if (mem_fops) { > + ret = mem_fops->copy_file_range(file_in, pos_in, > + > file_out, pos_out, > + len, > + flags); > + } else if (file_out->f_op->copy_file_range) { ... > > and update the comment above to say that: > + * For copy to/from memory fd, we alway call the copy method of the > memory fd. > */ > > I think that makes the code not more ugly than it already is. > > Thanks, > Amir. Good advice - thanks a lot! Regards, Wangtao.