On Tue, Dec 10, 2024 at 08:01:08PM +0300, Daniil Tatianin wrote:
> I mentioned my use case in the cover letter. Basically we want to protect
> QEMU's pages from being migrated and compacted by kcompactd, which it
> accomplishes by modifying live page tables and spamming the process with TLB
> invalidate IPIs while it does that, which kills guest performance for the
> duration of the compaction operation.

Ah right, I read it initially, but when I scanned the cover letter again just
now I missed that. My fault.

>
> Memory locking allows to protect a process from kcompactd page compaction
> and more importantly, migration (that is taking a PTE and replacing it with
> one, which is closer in memory to reduce fragmentation). (As long as
> /proc/sys/vm/compact_unevictable_allowed is 0)
>
> For this use case we don't mind page faults as they take more or less
> constant time, which we can also avoid if we wanted by preallocating guest
> memory. We do, however, want PTEs to be untouched by kcompactd, which
> MCL_ONFAULT accomplishes just fine without the extra memory overhead that
> comes from various anonymous mappings getting write-faulted with the
> currently available mem-lock=on option.
>
> In our case we use KVM of course, TCG was just an experiment where I noticed
> anonymous memory jump way too much.
>
> I don't think it's feasible in our case to look for the origin of every
> anonymous mapping that grew compared to the no mem-lock case (which there's
> about ~30 with default Q35 + KVM, without any extra devices), and try to
> optimize it to map anonymous memory less eagerly.

Would it be better then to use mem-lock=on|off|onfault?  That turns it into a
string, which avoids the "exclusiveness" check that would otherwise be needed
(and having two separate knobs for related things looks odd too).

Thanks,

--
Peter Xu
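
To make the mem-lock=on|off|onfault suggestion concrete, here is a minimal sketch of how a string-valued option could map onto mlockall() flags. It is an illustration only, not the actual QEMU patch: the MemLockMode enum and the mem_lock_parse(), mem_lock_flags() and mem_lock_apply() helpers are hypothetical names. Only mlockall() and the MCL_* flags are real; MCL_ONFAULT is Linux-specific, so the sketch reports an error where it is not defined.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

/* Hypothetical tri-state for a string-valued mem-lock option. */
typedef enum {
    MEM_LOCK_OFF,
    MEM_LOCK_ON,
    MEM_LOCK_ONFAULT,
} MemLockMode;

/* Parse "on", "off" or "onfault". Returns 0 on success, -1 otherwise. */
int mem_lock_parse(const char *s, MemLockMode *mode)
{
    if (strcmp(s, "off") == 0) {
        *mode = MEM_LOCK_OFF;
    } else if (strcmp(s, "on") == 0) {
        *mode = MEM_LOCK_ON;
    } else if (strcmp(s, "onfault") == 0) {
        *mode = MEM_LOCK_ONFAULT;
    } else {
        return -1;
    }
    return 0;
}

/*
 * Translate a mode into mlockall() flags.
 * Returns 0 when nothing should be locked, -1 when the mode is unsupported.
 */
int mem_lock_flags(MemLockMode mode)
{
    switch (mode) {
    case MEM_LOCK_ON:
        /* Lock everything now and in the future; pages get faulted in. */
        return MCL_CURRENT | MCL_FUTURE;
    case MEM_LOCK_ONFAULT:
#ifdef MCL_ONFAULT
        /* Lock pages only once they are faulted in. */
        return MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT;
#else
        return -1;
#endif
    case MEM_LOCK_OFF:
    default:
        return 0;
    }
}

/* Apply the mode to the calling process. Returns 0 on success, -1 on error. */
int mem_lock_apply(MemLockMode mode)
{
    int flags = mem_lock_flags(mode);

    if (flags < 0) {
        return -1;
    }
    if (flags == 0) {
        return 0; /* mem-lock=off: nothing to do */
    }
    return mlockall(flags);
}
```

With a single enum there is no combination of options to reject: each string selects exactly one mode, which is the point of using one knob rather than two.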