On Mon, Aug 15, 2016 at 07:22:24PM +0000, Konstantin Belousov wrote: > [snip] > @@ -254,15 +259,23 @@ loop: > if ((bp->b_vflags & BV_SCANNED) != 0) > continue; > bp->b_vflags |= BV_SCANNED; > - /* Flush indirects in order. */ > + /* > + * Flush indirects in order, if requested. > + * > + * Note that if only datasync is requested, we can > + * skip indirect blocks when softupdates are not > + * active. Otherwise we must flush them with data, > + * since dependencies prevent data block writes. > + */ > if (waitfor == MNT_WAIT && bp->b_lblkno <= -NDADDR && > - lbn_level(bp->b_lblkno) >= passes) > + (lbn_level(bp->b_lblkno) >= passes || > + ((flags & DATA_ONLY) != 0 && !DOINGSOFTDEP(vp)))) > continue; > if (bp->b_lblkno > lbn) > panic("ffs_syncvnode: syncing truncated data."); > if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) == 0) { > BO_UNLOCK(bo); > - } else if (wait != 0) { > + } else if (wait) { > if (BUF_LOCK(bp, > LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, > BO_LOCKPTR(bo)) != 0) {
Hmm, some people interpret POSIX differently than I do, but I think it is clear that XBD 3.384 Synchronized I/O Data Integrity Completion requires the indirect blocks to be written (because "all file system information required to retrieve the data is successfully transferred"). The Linux man page matches this interpretation except that an fsync of the parent directory may be required. If the whole file is being considered, then any change to indirect blocks is needed to access some data (because the file was extended, a hole was filled or snapshotted data was overwritten). For other overwrites, there is no change to indirect blocks and no conflict with fdatasync's performance objectives. > @@ -330,31 +343,59 @@ next: > * these will be done with one sync and one async pass. > */ > if (bo->bo_dirty.bv_cnt > 0) { > - /* Write the inode after sync passes to flush deps. */ > - if (wait && DOINGSOFTDEP(vp) && (flags & NO_INO_UPDT) == 0) { > - BO_UNLOCK(bo); > - ffs_update(vp, 1); > - BO_LOCK(bo); > + if ((flags & DATA_ONLY) == 0) { > + still_dirty = true; > + } else { > + /* > + * For data-only sync, dirty indirect buffers > + * are ignored. > + */ > + still_dirty = false; > + TAILQ_FOREACH(bp, &bo->bo_dirty.bv_hd, b_bobufs) { > + if (bp->b_lblkno > -NDADDR) { > + still_dirty = true; > + break; > + } > + } > } > - /* switch between sync/async. */ > - wait = !wait; > - if (wait == 1 || ++passes < NIADDR + 2) > - goto loop; > + > + if (still_dirty) { > + /* Write the inode after sync passes to flush deps. */ > + if (wait && DOINGSOFTDEP(vp) && > + (flags & NO_INO_UPDT) == 0) { > + BO_UNLOCK(bo); > + ffs_update(vp, 1); > + BO_LOCK(bo); > + } > + /* switch between sync/async. */ > + wait = !wait; > + if (wait || ++passes < NIADDR + 2) > + goto loop; > #ifdef INVARIANTS > - if (!vn_isdisk(vp, NULL)) > - vn_printf(vp, "ffs_fsync: dirty "); > + if (!vn_isdisk(vp, NULL)) > + vn_printf(vp, "ffs_fsync: dirty "); > #endif > + } > } > BO_UNLOCK(bo); > error = 0; > - if ((flags & NO_INO_UPDT) == 0) > - error = ffs_update(vp, 1); > - if (DOINGSUJ(vp)) > - softdep_journal_fsync(VTOI(vp)); > + if ((flags & DATA_ONLY) == 0) { > + if ((flags & NO_INO_UPDT) == 0) > + error = ffs_update(vp, 1); > + if (DOINGSUJ(vp)) > + softdep_journal_fsync(VTOI(vp)); > + } > return (error); > } Ideally, a changed i_size would also cause the inode to be written, but I don't know how to determine that (there is no i_flag for it). Always writing the inode would defeat the point of fdatasync. -- Jilles Tjoelker _______________________________________________ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"