> > >>>> I mean, that would be fundamentally broken, because the fsync() would
> > >>>> corrupt the file. So I assume in a sane environment, the dst could only
> > >>>> have stale clean pagecache pages. And we'd have to get rid of these to
> > >>>> re-read everything from file.
> > >>>
> > >>> In case of write back cache mode, we could still have stale dirty
> > >>> pages at the destination host and destination fsync is not the right
> > >>> thing to do. We need to invalidate these pages
> > >>> (Can we invalidate dirty pages resident in page cache with
> > >>> POSIX_FADV_DONTNEED as well?) man pages say, we cannot (unless i
> > >>> misunderstood it).
> > >>>
> > >>
> > >> I think you'd have to fsync + POSIX_FADV_DONTNEED. But I am still
> > >> confused how we could end up with dirty pagecache pages on the
> > >> destination. In my opinion, there should only be clean pagecache pages
> > >> -- can someone enlighten me? :)
> > >
> > > because of activity on the page cache pages corresponding to mmap region
> > > in the past which is not synced yet or not reclaimed yet. Maybe this
> > > is hypothetical or not possible, happy to learn?
> >
> > Right, but assume the following *sane*
> >
> > #1 H0 starts and runs VM.
> > #2 H0 migrates VM to H1.
> > #3 H1 runs VM.
> > #4 H1 migrates VM to H0.
> > #5 H0 runs VM.
> >
> > We'd expect a proper fsync during #2, writing back any dirty pages to
> > the memory backend. Otherwise, #3 would already be broken. Similarly,
> > we'd expect a proper fsync during #4.
> >
> > I assume during #4 we could find clean pagecache pages that are actually
> > invalid, because the underlying file was changed by H1. So we have to
> > make sure to invalidate all pagecache pages (all clean).
>
> Yes, you mean fsync on source host before migration starts. My point
> is something like another process mmap same backend file on destination
> host and/or guest/qemu crashing abruptly.

In that case we should not start the guest if we cannot invalidate all
the corresponding page cache pages before starting it, i.e. before
mmapping the virtio-pmem backend file.

Thank you for the discussion!

Best regards,
Pankaj
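
P.S. For reference, the "fsync + POSIX_FADV_DONTNEED" sequence mentioned in
the thread could look roughly like the sketch below. This is only an
illustration, not QEMU code: the helper name flush_and_drop_pagecache() is
made up, and it assumes the file has no stale dirty pages, since the fsync()
would otherwise write them back over the newer data. POSIX_FADV_DONTNEED is
also only advice; as far as I understand, pages that are still mapped by
another process or are still dirty may stay in the page cache, so a zero
return value does not prove that everything was invalidated.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for posix_fadvise() */
#endif

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from QEMU): write back dirty pages of the file
 * at `path`, then ask the kernel to drop its page cache pages.
 * Returns 0 on success, or a positive errno value on failure.
 */
int flush_and_drop_pagecache(const char *path)
{
    int err = 0;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return errno;

    /* DONTNEED does not discard dirty pages, so write them back first. */
    if (fsync(fd) != 0)
        err = errno;

    /*
     * offset 0 with len 0 covers the whole file. posix_fadvise() returns
     * the error number directly instead of setting errno.
     */
    if (err == 0)
        err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    if (close(fd) != 0 && err == 0)
        err = errno;

    return err;
}
```

A caller would run this on the backend file before mmap()ing it and refuse
to start the guest on a non-zero return. A zero return would still need an
additional check if the guarantee has to be strict.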