On Thu, Aug 13, 2015 at 10:13:32 +0200, Paolo Bonzini wrote: > On 12/08/2015 22:37, Emilio G. Cota wrote: > > > page_find is reading the radix tree outside all locks, so it has to > > > use the RCU primitives. It does not need RCU critical sections > > > because the PageDescs are never removed, so there is never a need > > > to wait for the end of code sections that use a PageDesc. > > > > Note that rcu_find_alloc might end up writing to the tree, see below. > > Yes, but in that case it's always called with the mmap_lock held, see > patch 7.
Oh I see. Didn't have much time to take a deep look at the patchset; the commit message got me confused. > page_find_alloc is only called by tb_alloc_page (called by tb_link_page > which takes mmap_lock), or by page_set_flags (called with mmap_lock held > by linux-user/mmap.c). > > > BTW the fact that there are no removals makes the use of RCU unnecessary. > > It only makes it not use the RCU synchronization primitives. You still > need the memory barriers. Yes. Personally I find it confusing to use the RCU macros just for the convenience that they bring in barriers we need; I'd prefer to explicitly have the barriers in the code. > > I argue however that it is better to call page_find/_alloc with a mutex > > held, > > since otherwise we'd have to add per-PageDesc locks (it's very common to > > call page_find and then update the PageDesc). > > The fields are protected by either the mmap_lock (e.g. the flags, see > page_unprotect and tb_alloc_page) or the tb_lock (e.g. the tb lists). > > The code is complicated and could definitely use more documentation, > especially for struct PageDesc, but it seems correct to me apart from > the lock inversion fixed in patch 10. OK. I have a bit of time today and tomorrow so I'll rebase my work on top of this patchset and then submit it. Thanks, Emilio