On Fri, Mar 30, 2007 at 04:40:48AM +0200, Nick Piggin wrote:
> > Well it would make life easier if we got rid of ZERO_PAGE completely,
> which I definitely wouldn't complain about ;)
So, what bad things (apart from my bugs in untested code) happen if we do
this? We can actually go further, and probably remove the ZERO_PAGE
completely (we'd just need an extra get_user_pages flag or something for
the core dumping issue). Shall I do a more complete patchset and ask
Andrew to give it a run in -mm?

--
ZERO_PAGE for anonymous pages seems to be designed only to help stupid
programs, so remove it. This solves issues with ZERO_PAGE refcounting and
NUMA un-awareness.

(Actually, not quite. We should also remove all the zeromap stuff, which
likewise seems to do little except help stupid programs.)

Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1613,16 +1613,10 @@ gotten:
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	if (old_page == ZERO_PAGE(address)) {
-		new_page = alloc_zeroed_user_highpage(vma, address);
-		if (!new_page)
-			goto oom;
-	} else {
-		new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
-		if (!new_page)
-			goto oom;
-		cow_user_page(new_page, old_page, address, vma);
-	}
+	new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+	if (!new_page)
+		goto oom;
+	cow_user_page(new_page, old_page, address, vma);
 
 	/*
 	 * Re-check the pte - we dropped the lock
@@ -2130,52 +2124,33 @@ static int do_anonymous_page(struct mm_s
 	spinlock_t *ptl;
 	pte_t entry;
 
-	if (write_access) {
-		/* Allocate our own private page. */
-		pte_unmap(page_table);
+	/* Allocate our own private page. */
+	pte_unmap(page_table);
 
-		if (unlikely(anon_vma_prepare(vma)))
-			goto oom;
-		page = alloc_zeroed_user_highpage(vma, address);
-		if (!page)
-			goto oom;
+	if (unlikely(anon_vma_prepare(vma)))
+		return VM_FAULT_OOM;
+	page = alloc_zeroed_user_highpage(vma, address);
+	if (!page)
+		return VM_FAULT_OOM;
 
-		entry = mk_pte(page, vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	entry = mk_pte(page, vma->vm_page_prot);
+	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 
-		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
-		if (!pte_none(*page_table))
-			goto release;
+	page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
+	if (likely(pte_none(*page_table))) {
 		inc_mm_counter(mm, anon_rss);
 		lru_cache_add_active(page);
 		page_add_new_anon_rmap(page, vma, address);
-	} else {
-		/* Map the ZERO_PAGE - vm_page_prot is readonly */
-		page = ZERO_PAGE(address);
-		page_cache_get(page);
-		entry = mk_pte(page, vma->vm_page_prot);
-
-		ptl = pte_lockptr(mm, pmd);
-		spin_lock(ptl);
-		if (!pte_none(*page_table))
-			goto release;
-		inc_mm_counter(mm, file_rss);
-		page_add_file_rmap(page);
-	}
-
-	set_pte_at(mm, address, page_table, entry);
+		set_pte_at(mm, address, page_table, entry);
 
-	/* No need to invalidate - it was non-present before */
-	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
-unlock:
+		/* No need to invalidate - it was non-present before */
+		update_mmu_cache(vma, address, entry);
+		lazy_mmu_prot_update(entry);
+	} else
+		page_cache_release(page);
 	pte_unmap_unlock(page_table, ptl);
+	return VM_FAULT_MINOR;
-release:
-	page_cache_release(page);
-	goto unlock;
-oom:
-	return VM_FAULT_OOM;
 }
 
 /*