On Tue, Jun 22, 2021 at 11:08 AM Paul Guo <paul...@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 3:19 PM Michael Paquier <mich...@paquier.xyz> wrote:
> >
> > On Wed, Jun 02, 2021 at 05:02:10PM +0900, Michael Paquier wrote:
> > > On Wed, Jun 02, 2021 at 06:20:30PM +1200, Thomas Munro wrote:
> > > > The main thing I noticed was that Linux < 5.3 can fail with EXDEV if
> > > > you cross a filesystem boundary, is that something we need to worry
> > > > about there?
> > >
> > > Hmm.  Good point.  That may justify having a switch to control that.
> >
> > Paul, the patch set still needs some work, so I am switching it as
> > waiting on author.  I am pretty sure that we had better have a
> > fallback implementation of copy_file_range() in src/port/, and that we
> > are going to need an extra switch in pg_rewind to allow users to
> > bypass copy_file_range()/EXDEV if they do a local rewind operation
> > across different FSes with a kernel < 5.3.
> > --
>
> I did modification on the copy_file_range() patch yesterday by simply falling
> back to read()+write() but I think it could be improved further.
>
> We may add a function to determine two file/path are copy_file_range()
> capable or not (using POSIX standard statvfs():f_fsid?) - that could be used
> by other copy_file_range() users although in the long run the function
> is not needed.
> And even having this we may still need the fallback code if needed.
>
> - For pg_rewind, we may just determine that ability once on src/dst pgdata,
> but since there might be soft link (tablespace/wal) in pgdata so we should
> still allow fallback for those non copy_file_range() capable file copying.
> - Also it seems that sometimes copy_file_range() could return
> ENOTSUP/EOPNOTSUP
> (the file system does not support that and the kernel does not fall
> back to simple copying?)
> although this is not documented and it seems not usual?
>
> Any idea?

I modified the copy_file_range() patch to use the following logic: if the
first call to copy_file_range() fails with errno EXDEV or ENOTSUP, pg_rewind
stops using copy_file_range() for the rest of the run. Whenever
copy_file_range() fails, we fall back to the previous read()+write() code
path for that file.
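
To make the logic above concrete, here is a minimal standalone sketch. It
is not the code from the attached patch. It assumes Linux with glibc >= 2.27,
and the names use_copy_file_range, copy_by_read_write(),
copy_range_with_fallback() and copy_whole_file() are hypothetical. One
deliberate simplification: the flag is cleared on any EXDEV/ENOTSUP failure,
not only on the very first call.

```c
#define _GNU_SOURCE				/* exposes copy_file_range() in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical process-wide switch, cleared on EXDEV/ENOTSUP. */
static bool use_copy_file_range = true;

/* Fallback path: copy len bytes at offset off with pread()+pwrite(). */
static int
copy_by_read_write(int srcfd, int dstfd, off_t off, size_t len)
{
	char		buf[8192];

	while (len > 0)
	{
		size_t		want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t		nread = pread(srcfd, buf, want, off);
		ssize_t		done = 0;

		if (nread < 0 && errno == EINTR)
			continue;
		if (nread <= 0)
			return -1;			/* read error or unexpected end of file */

		while (done < nread)
		{
			ssize_t		nwritten = pwrite(dstfd, buf + done,
										  (size_t) (nread - done), off + done);

			if (nwritten < 0 && errno == EINTR)
				continue;
			if (nwritten <= 0)
				return -1;
			done += nwritten;
		}
		off += nread;
		len -= (size_t) nread;
	}
	return 0;
}

/* Try copy_file_range() first; finish whatever is left with read()+write(). */
int
copy_range_with_fallback(int srcfd, int dstfd, off_t off, size_t len)
{
	assert(srcfd >= 0 && dstfd >= 0);

	while (use_copy_file_range && len > 0)
	{
		off_t		in = off;
		off_t		out = off;
		ssize_t		n = copy_file_range(srcfd, &in, dstfd, &out, len, 0);

		if (n > 0)
		{
			off += n;
			len -= (size_t) n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;
		/* Cross-filesystem or unsupported: stop trying for the whole run. */
		if (n < 0 && (errno == EXDEV || errno == ENOTSUP))
			use_copy_file_range = false;
		break;					/* any other failure: fall back for this file */
	}
	return copy_by_read_write(srcfd, dstfd, off, len);
}

/* Convenience wrapper for the sketch: copy src to dst in full. */
int
copy_whole_file(const char *src, const char *dst)
{
	struct stat st;
	int			srcfd = open(src, O_RDONLY);
	int			dstfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	int			rc = -1;

	if (srcfd >= 0 && dstfd >= 0 && fstat(srcfd, &st) == 0)
		rc = copy_range_with_fallback(srcfd, dstfd, 0, (size_t) st.st_size);
	if (srcfd >= 0)
		close(srcfd);
	if (dstfd >= 0)
		close(dstfd);
	return rc;
}
```

Since the fallback resumes from the current offset, a partial
copy_file_range() followed by a failure still produces a complete copy.
On Linux ENOTSUP and EOPNOTSUPP have the same value, so one check covers
both.
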
v4-0001-Fsync-the-affected-files-directories-only-in-pg_r.patch
Description: Binary data
v4-0002-Use-copy_file_range-if-possible-for-file-copying-.patch
Description: Binary data