On Thu, 21 Jan 2016 15:57:32 +0100 Gerald Schaefer <gerald.schae...@de.ibm.com> wrote:
> > --- a/fs/proc/task_mmu.c~numa-fix-proc-pid-numa_maps-on-s390-fix
> > +++ a/fs/proc/task_mmu.c
> > @@ -1523,6 +1523,7 @@ static int gather_pte_stats(pmd_t *pmd,
> >  	pte_t *pte;
> >
> >  	ptl = pmd_trans_huge_lock(pmd, vma);
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  	if (ptl) {
> >  		pte_t huge_pte = huge_ptep_get((pte_t *)pmd);
> >  		struct page *page;
> > @@ -1534,6 +1535,7 @@ static int gather_pte_stats(pmd_t *pmd,
> >  		spin_unlock(ptl);
> >  		return 0;
> >  	}
> > +#endif
> >
> >  	if (pmd_trans_unstable(pmd))
> >  		return 0;
>
> Hi Andrew,
>
> Unfortunately this seems to be a lot more complicated than we thought.
> huge_ptep_get() is only defined when CONFIG_HUGETLB_PAGE=y, independent
> from CONFIG_TRANSPARENT_HUGEPAGE. This is because asm/hugetlb.h is only
> included from linux/hugetlb.h when CONFIG_HUGETLB_PAGE=y.
>
> So this fix won't fix the build error when CONFIG_HUGETLBFS=n and
> CONFIG_TRANSPARENT_HUGEPAGE=y.
>
> Since the THP code did not repeat the flaws of the hugetlbfs code, i.e.
> it is actually working on PMD entries and not PTE entries, there was
> no need for huge_ptep_get() for THP so far.
>
> Now it seems that the THP code in gather_pte_stats() is an exception to
> this, as it is not working on a PMD like the rest of the THP code, but
> also on a fake "PTE" like the hugetlbfs code.
>
> I guess this needs more thinking, two options are crossing my mind:
> - Fix the THP code in gather_pte_stats() to properly use a PMD instead of
>   PTE. This would probably require something like a "_pmd" version of
>   "can_gather_numa_stats()" and a pmd_dirty() check for the
>   gather_stats() parameter.
> - Make huge_ptep_get() also available for CONFIG_HUGETLBFS=n, perhaps
>   by introducing something like HAVE_ARCH_HUGE_PTEP_GET and implementing
>   the default NOP version in linux/hugetlbfs.h instead of the individual
>   asm/hugetlbfs.h files for all archs.
>
> The first option seems more correct, but it might entail other problems.
> The second option would also introduce new problems on s390, where the
> implementation of huge_ptep_get() in arch/s390/mm/hugetlbpage.c is currently
> only built with CONFIG_HUGETLBFS=y, but I guess we could handle that.
>
> Any thoughts / more ideas?
>

The first option does of course sound better. But you need numa_maps
fixed, presumably in 4.5 and possibly backported into -stable? (The
changelog doesn't describe the end-user-visible effects of the bug.
Naughty changelog!)

So is there some minimal thing we can do for now to get things working
properly and fix it for-real in 4.6?
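
For reference, the second option follows the usual "generic default unless the arch overrides it" pattern. The following is only a userspace sketch of that pattern, not a patch: pte_t is a stand-in struct rather than the kernel type, HAVE_ARCH_HUGE_PTEP_GET is the hypothetical name proposed in the mail above, and the fallback body (a plain read of the entry) is an assumption about what the default version would do.

```c
/*
 * Userspace mock of the override pattern sketched in the mail.
 * Nothing here is kernel code: pte_t is a stand-in type and
 * HAVE_ARCH_HUGE_PTEP_GET is the hypothetical symbol from the mail.
 */
#include <stdio.h>

typedef struct { unsigned long val; } pte_t;	/* stand-in, not the kernel pte_t */

/*
 * An architecture that needs special handling (s390 in the mail)
 * would define HAVE_ARCH_HUGE_PTEP_GET and supply its own
 * huge_ptep_get(). Everyone else gets the generic version below,
 * independent of whether hugetlbfs support is configured.
 */
#ifndef HAVE_ARCH_HUGE_PTEP_GET
static inline pte_t huge_ptep_get(pte_t *ptep)
{
	return *ptep;	/* assumed default: just read the entry */
}
#endif

/* Helper for the mock only: print an entry the way a caller might inspect it. */
static inline void show_entry(pte_t *ptep)
{
	printf("entry: 0x%lx\n", huge_ptep_get(ptep).val);
}
```

With the symbol left undefined, a caller such as show_entry() picks up the generic fallback. Building the same file with -DHAVE_ARCH_HUGE_PTEP_GET and no arch-specific definition fails to compile, which is roughly the s390 concern raised above: the arch version would then have to be built even when hugetlbfs is disabled.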