On Thu, 31 Jan 2008, Andrea Arcangeli wrote: > On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: > > How we offload that? Before the scan of the rmaps we do not have the > > mmstruct. So we'd need another notifier_rmap_callback. > > My assumption is that that "int lock" exists just because > unmap_mapping_range_vma exists. If I'm right then my suggestion was to > move the invalidate_range after dropping the i_mmap_lock and not to > invoke it inside zap_page_range.
There is still no pointer to the mm_struct available there because pages of a mapping may belong to multiple processes. So we need to add another rmap method? The same issue is also occurring for unmap_hugepages(). > There's no reason why KVM should take any risk of corrupting memory > due to a single missing mmu notifier, with not taking the > refcount. get_user_pages will take it for us, so we have to pay the > atomic-op anyway. It sure worth doing the atomic_dec inside the mmu > notifier, and not immediately like this: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. > get_user_pages(pages) > __free_page(pages[0]) > > The idea is that what works for GRU, works for KVM too. So we do a > single invalidate_page and clustered invalidate_pages, we add that, > and then we make sure all places are covered so GRU will not > kernel-crash, and KVM won't risk to run oom or to generate _userland_ > corruption. Hmmmm.. Could we go to a scheme where we do not have to increase the page count? Modifications of the page struct require dirtying a cache line and it seems that we do not need an increased page count if we have an invalidate_range_start() that clears all the external references and stops the establishment of new ones and invalidate_range_end() that reenables new external references? Then we do not need the frequent invalidate_page() calls. The typical case would be anyways that invalidate_all() is called before anything else on exit. Invalidate_all() would remove all pages and disable creation of new references to the memory in the mm_struct. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/