On Tue, Apr 29, 2025 at 10:33 PM Suren Baghdasaryan <sur...@google.com> wrote: > On Tue, Apr 29, 2025 at 11:55 AM Jann Horn <ja...@google.com> wrote: > > On Tue, Apr 29, 2025 at 8:04 PM Suren Baghdasaryan <sur...@google.com> > > wrote: > > > On Tue, Apr 29, 2025 at 10:21 AM Jann Horn <ja...@google.com> wrote: > > > > > > > > Hi! > > > > > > > > (I just noticed that I incorrectly assumed that VMAs use kfree_rcu > > > > (not SLAB_TYPESAFE_BY_RCU) when I wrote my review of this, somehow I > > > > forgot all about that...) > > > > > > Does this fact affect your previous comments? Just want to make sure > > > I'm not missing something... > > > > When I suggested using "WRITE_ONCE(vma->vm_file, NULL)" when tearing > > down a VMA, and using get_file_rcu() for the lockless lookup, I did > > not realize that you could actually also race with all the other > > places that set ->vm_file, like __mmap_new_file_vma() and so on; and I > > did not think about whether any of those code paths might leave a VMA > > with a dangling ->vm_file pointer. > > So, let me summarize my understanding and see if it's correct. > > If we copy the original vma, ensure that it hasn't changed while we > were copying (with mmap_lock_speculate_retry()) and then use > get_file_rcu(©->vm_file) I think we are guaranteed no UAF because > we are in RCU read section. At this point the only issue is that > copy->vm_file might have lost its last refcount and get_file_rcu() > would enter an infinite loop.
Yeah. (Using get_file_active() would avoid that.) > So, to avoid that we have to use the > original vma when calling get_file_rcu() Sorry - I originally said that, but I didn't think about SLAB_TYPESAFE_BY_RCU when I had that in mind. > but then we should also > ensure that vma itself does not change from under us due to > SLAB_TYPESAFE_BY_RCU used for vm_area_struct cache. If it does change > from under us we might end up accessing an invalid address if > vma->vm_file is being modified concurrently. Yeah, I think in theory we would have data races, since the file* reads in get_file_rcu() could race with all the (plain) ->vm_file pointer stores. So I guess it might actually be safer to use the copied VMA's ->vm_file for this, with get_file_active(). Though that would be abusing get_file_active() quite a bit, so brauner@ should probably look over this early and see whether he thinks that's acceptable... > > I guess maybe that means you really do need to do the lookup from the > > copied data, as you did in your patch; and that might require calling > > get_file_active() on the copied ->vm_file pointer (instead of > > get_file_rcu()), even though I think that is not really how > > get_file_active() is supposed to be used (it's supposed to be used > > when you know the original file hasn't been freed yet). Really what > > you'd want for that is basically a raw __get_file_rcu(), but that is > > static and I think Christian wouldn't want to expose more of these > > internals outside VFS... > > (In that case, all the stuff below about get_file_rcu() would be moot.) > > > > Or you could pepper WRITE_ONCE() over all the places that write > > ->vm_file, and ensure that ->vm_file is always NULLed before its > > reference is dropped... but that seems a bit more ugly to me. > > Ugh, yes. We either ensure no vma->vm_file tearing or use > __get_file_rcu() on a copy of the vma. Or we have to stabilize the vma > by locking it... Let me think about all these options. Thanks!