On 01/03, Andrew Morton wrote: > > On Fri, 3 Jan 2014 20:55:47 +0100 Oleg Nesterov <o...@redhat.com> wrote: > > > get/put_page(thp_tail) paths do get_page_unless_zero(page_head) + > > compound_lock(). In theory this page_head can be already freed and > > reallocated as alloc_pages(__GFP_COMP, smaller_order). In this case > > get_page_unless_zero() can succeed right after set_page_refcounted(), > > and compound_lock() can race with the non-atomic __SetPageHead(). > > Would be useful to mention that these things are happening inside > prep_compound_opage() (yes?).
Agreed. Added "in prep_compound_opage()" into the changelog: get/put_page(thp_tail) paths do get_page_unless_zero(page_head) + compound_lock(). In theory this page_head can be already freed and reallocated as alloc_pages(__GFP_COMP, smaller_order). In this case get_page_unless_zero() can succeed right after set_page_refcounted(), and compound_lock() can race with the non-atomic __SetPageHead() in prep_compound_page(). Perhaps we should rework the thp locking (under discussion), but until then this patch moves set_page_refcounted() and adds wmb() to ensure that page->_count != 0 comes as a last change. I am not sure about other callers of set_page_refcounted(), but at first glance they look fine to me. or should I send v3? > > Perhaps we should rework the thp locking (under discussion), but > > until then this patch moves set_page_refcounted() and adds wmb() > > to ensure that page->_count != 0 comes as a last change. > > > > I am not sure about other callers of set_page_refcounted(), but at > > first glance they look fine to me. > > I don't get it. We're in prep_new_page() - this page is freshly > allocated and no other thread yet has any means by which to look it up > and start fiddling with it? Yes, but thp can access this page_head via stale pointer, tail->first_page, if it races with split_huge_page_refcount(). In this case we rely on compound_lock() to detect this race, the problem is that compound_lock() itself can race with head_page->flags manipulations. For example, __get_page_tail() roughly does: // PageTail(page) was already checked page_head = page->first_page; /* WINDOW */ get_page_unless_zero(page_head); compound_lock(page_head); recheck PageTail(page) to ensure page_head is still valid However, in the WINDOW above, split_huge_page() can split this huge page. After that its head can be freed and reallocated. Of course, I don't think it is possible to hit this race in practice, but still this looks wrong. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/