On Tue, Apr 29, 2025 at 7:15 PM Suren Baghdasaryan <sur...@google.com> wrote: > On Tue, Apr 29, 2025 at 8:56 AM Jann Horn <ja...@google.com> wrote: > > On Wed, Apr 23, 2025 at 12:54 AM Andrii Nakryiko > > <andrii.nakry...@gmail.com> wrote: > > > On Fri, Apr 18, 2025 at 10:50 AM Suren Baghdasaryan <sur...@google.com> > > > wrote: > > > > Utilize speculative vma lookup to find and snapshot a vma without > > > > taking mmap_lock during PROCMAP_QUERY ioctl execution. Concurrent > > > > address space modifications are detected and the lookup is retried. > > > > While we take the mmap_lock for reading during such contention, we > > > > do that momentarily only to record new mm_wr_seq counter. > > > > > > PROCMAP_QUERY is an even more obvious candidate for fully lockless > > > speculation, IMO (because it's more obvious that vma's use is > > > localized to do_procmap_query(), instead of being spread across > > > m_start/m_next and m_show as with seq_file approach). We do > > > rcu_read_lock(), mmap_lock_speculate_try_begin(), query for VMA (no > > > mmap_read_lock), use that VMA to produce (speculative) output, and > > > then validate that VMA or mm_struct didn't change with > > > mmap_lock_speculate_retry(). If it did - retry, if not - we are done. > > > No need for vma_copy and any gets/puts, no? > > > > I really strongly dislike this "fully lockless" approach because it > > means we get data races all over the place, and it gets hard to reason > > about what happens especially if we do anything other than reading > > plain data from the VMA. When reading the implementation of > > do_procmap_query(), at basically every memory read you'd have to think > > twice as hard to figure out which fields can be concurrently updated > > elsewhere and whether the subsequent sequence count recheck can > > recover from the resulting badness. > > > > Just as one example, I think get_vma_name() could (depending on > > compiler optimizations) crash with a NULL deref if the VMA's ->vm_ops > > pointer is concurrently changed to &vma_dummy_vm_ops by vma_close() > > between "if (vma->vm_ops && vma->vm_ops->name)" and > > "vma->vm_ops->name(vma)". And I think this illustrates how the "fully > > lockless" approach creates more implicit assumptions about the > > behavior of core MM code, which could be broken by future changes to > > MM code. > > Yeah, I'll need to re-evaluate such an approach after your review. I > like having get_stable_vma() to obtain a completely stable version of > the vma in a localized place and then stop worrying about possible > races. If implemented correctly, would that be enough to address your > concern, Jann?
Yes, I think a stable local snapshot of the VMA (where tricky data races are limited to the VMA snapshotting code) is a good tradeoff.