On Wed, Apr 23, 2025 at 12:54 AM Andrii Nakryiko <andrii.nakry...@gmail.com> wrote: > On Fri, Apr 18, 2025 at 10:50 AM Suren Baghdasaryan <sur...@google.com> wrote: > > Utilize speculative vma lookup to find and snapshot a vma without > > taking mmap_lock during PROCMAP_QUERY ioctl execution. Concurrent > > address space modifications are detected and the lookup is retried. > > While we take the mmap_lock for reading during such contention, we > > do that momentarily only to record new mm_wr_seq counter. > > PROCMAP_QUERY is an even more obvious candidate for fully lockless > speculation, IMO (because it's more obvious that vma's use is > localized to do_procmap_query(), instead of being spread across > m_start/m_next and m_show as with seq_file approach). We do > rcu_read_lock(), mmap_lock_speculate_try_begin(), query for VMA (no > mmap_read_lock), use that VMA to produce (speculative) output, and > then validate that VMA or mm_struct didn't change with > mmap_lock_speculate_retry(). If it did - retry, if not - we are done. > No need for vma_copy and any gets/puts, no?
I really strongly dislike this "fully lockless" approach because it means we get data races all over the place, and it gets hard to reason about what happens especially if we do anything other than reading plain data from the VMA. When reading the implementation of do_procmap_query(), at basically every memory read you'd have to think twice as hard to figure out which fields can be concurrently updated elsewhere and whether the subsequent sequence count recheck can recover from the resulting badness. Just as one example, I think get_vma_name() could (depending on compiler optimizations) crash with a NULL deref if the VMA's ->vm_ops pointer is concurrently changed to &vma_dummy_vm_ops by vma_close() between "if (vma->vm_ops && vma->vm_ops->name)" and "vma->vm_ops->name(vma)". And I think this illustrates how the "fully lockless" approach creates more implicit assumptions about the behavior of core MM code, which could be broken by future changes to MM code.