On Thu, Nov 19, 2020 at 03:59:40PM +0300, Andrey Gruzdev wrote:
> Since reading UFFD events and saving paged data are performed
> from the same thread, write fault latencies are sensitive to
> migration stream stalls. Limiting total page saving rate is a
> method to reduce amount of noticeable fault resolution latencies.
>
> Migration bandwidth limiting is achieved via noticing cases of
> out-of-threshold write fault latencies and temporarily disabling
> (strictly speaking, severely throttling) saving non-faulting pages.

Just curious: have you measured the average/max latency of
write-protected page requests, or better, its distribution? I believe
it also depends on where the snapshot is stored, i.e. the backend disk
used in your tests. Is that a file on some fs?

I would expect the latency to still be good even without a patch like
this if, e.g., the throughput of the backend file system is decent, but
I might have missed something.

In any case, it would be very nice if this patch came with a histogram,
or the average and max latency, measured and compared before and after
applying it.

Thanks,

-- 
Peter Xu
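
P.S. A rough sketch of the kind of measurement meant above, in case it
helps: a power-of-two latency histogram that also tracks the average
and the max. Everything here (LatHist, lat_hist_add() and friends) is a
hypothetical helper made up for illustration, not existing QEMU code.
It assumes the caller already has the fault resolution latency in
microseconds, e.g. by timestamping the moment the UFFD event is read
and the moment write protection is removed from the page.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bucket 0 counts latencies of 0..1 us; bucket i (i >= 1) counts
 * latencies in [2^i, 2^(i+1)) us; the last bucket also takes anything
 * larger. */
#define LAT_BUCKETS 32

typedef struct {
    uint64_t count;               /* number of samples recorded */
    uint64_t sum_us;              /* sum of all samples, for the average */
    uint64_t max_us;              /* largest sample seen so far */
    uint64_t bucket[LAT_BUCKETS]; /* per-bucket sample counts */
} LatHist;

/* Map a latency to its bucket: floor(log2(us)), clamped to the table. */
static unsigned lat_bucket_index(uint64_t us)
{
    unsigned idx = 0;

    while (us > 1 && idx < LAT_BUCKETS - 1) {
        us >>= 1;
        idx++;
    }
    return idx;
}

/* Record one write-fault resolution latency, in microseconds. */
static void lat_hist_add(LatHist *h, uint64_t us)
{
    assert(h != NULL);
    h->count++;
    h->sum_us += us;
    if (us > h->max_us) {
        h->max_us = us;
    }
    h->bucket[lat_bucket_index(us)]++;
}

/* Integer average latency in microseconds; 0 if nothing was recorded. */
static uint64_t lat_hist_avg_us(const LatHist *h)
{
    return h->count ? h->sum_us / h->count : 0;
}

/* Dump the summary and all non-empty buckets. */
static void lat_hist_print(const LatHist *h)
{
    printf("samples=%llu avg=%llu us max=%llu us\n",
           (unsigned long long)h->count,
           (unsigned long long)lat_hist_avg_us(h),
           (unsigned long long)h->max_us);
    for (unsigned i = 0; i < LAT_BUCKETS; i++) {
        if (h->bucket[i]) {
            printf("  bucket %2u: %llu\n", i,
                   (unsigned long long)h->bucket[i]);
        }
    }
}
```

Zero-initialize one LatHist per run (with and without the patch), call
lat_hist_add() once per resolved write fault, and print both at the end
of the snapshot so the two distributions can be compared side by side.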