On Mon, Jan 9, 2023 at 9:54 PM Suren Baghdasaryan <sur...@google.com> wrote:
> Introduce lock_vma_under_rcu function to lookup and lock a VMA during
> page fault handling. When VMA is not found, can't be locked or changes
> after being locked, the function returns NULL. The lookup is performed
> under RCU protection to prevent the found VMA from being destroyed before
> the VMA lock is acquired. VMA lock statistics are updated according to
> the results.
> For now only anonymous VMAs can be searched this way. In other cases the
> function returns NULL.
[...]
> +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> +                                          unsigned long address)
> +{
> +        MA_STATE(mas, &mm->mm_mt, address, address);
> +        struct vm_area_struct *vma, *validate;
> +
> +        rcu_read_lock();
> +        vma = mas_walk(&mas);
> +retry:
> +        if (!vma)
> +                goto inval;
> +
> +        /* Only anonymous vmas are supported for now */
> +        if (!vma_is_anonymous(vma))
> +                goto inval;
> +
> +        if (!vma_read_trylock(vma))
> +                goto inval;
> +
> +        /* Check since vm_start/vm_end might change before we lock the VMA */
> +        if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
> +                vma_read_unlock(vma);
> +                goto inval;
> +        }
> +
> +        /* Check if the VMA got isolated after we found it */
> +        mas.index = address;
> +        validate = mas_walk(&mas);
Question for Maple Tree experts:

Are you allowed to use mas_walk() like this? If the first mas_walk()
call encountered a single-entry tree, it would store
mas->node = MAS_ROOT, right? And then the second call would go into
mas_state_walk(), mas_start() would return NULL, mas_is_ptr() would be
true, and then mas_state_walk() would return the result of mas_start(),
which is NULL? And we'd end up with mas_walk() returning NULL on the
second run even though the tree hasn't changed?

> +        if (validate != vma) {
> +                vma_read_unlock(vma);
> +                count_vm_vma_lock_event(VMA_LOCK_MISS);
> +                /* The area was replaced with another one. */
> +                vma = validate;
> +                goto retry;
> +        }
> +
> +        rcu_read_unlock();
> +        return vma;
> +inval:
> +        rcu_read_unlock();
> +        count_vm_vma_lock_event(VMA_LOCK_ABORT);
> +        return NULL;
> +}
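
To make the concern about the second mas_walk() concrete, here is a toy userspace model of the sequence described in the question. It is not the maple tree code: every toy_* name is hypothetical, the tree holds at most one entry, and indices are ignored. It only models the assumption that the "start" step does work solely from the initial state and otherwise returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor states, loosely named after MAS_START / MAS_ROOT. */
enum toy_node { TOY_START, TOY_ROOT, TOY_NONE };

/* Toy tree: at most one entry, stored directly in the root pointer. */
struct toy_tree { void *root; };

/* Toy walk state: remembers where the previous walk left off. */
struct toy_state { struct toy_tree *tree; enum toy_node node; };

/* Assumed behavior: only does anything when the state is TOY_START. */
static void *toy_start(struct toy_state *s)
{
	if (s->node != TOY_START)
		return NULL;		/* state left over from an earlier walk */
	if (!s->tree->root) {
		s->node = TOY_NONE;	/* empty tree */
		return NULL;
	}
	s->node = TOY_ROOT;		/* single-entry tree */
	return s->tree->root;
}

/* Toy walk for the single-entry case: the result of toy_start() is final. */
static void *toy_walk(struct toy_state *s)
{
	return toy_start(s);
}

/* Put the state back so that the next walk starts from scratch. */
static void toy_reset(struct toy_state *s)
{
	s->node = TOY_START;
}

/* Walk twice without a reset, then once more after a reset. */
static void toy_demo(void)
{
	int entry = 0;
	struct toy_tree tree = { &entry };
	struct toy_state s = { &tree, TOY_START };

	assert(toy_walk(&s) == &entry);	/* first walk finds the entry */
	assert(toy_walk(&s) == NULL);	/* second walk: stale state, NULL */
	toy_reset(&s);
	assert(toy_walk(&s) == &entry);	/* after reset the entry is found again */
}
```

If the real code behaves like this model, the second walk would need the state reset first. As far as I can tell, include/linux/maple_tree.h has mas_reset() and mas_set() for that purpose, but whether one of them is needed here is exactly the open question.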