On Thu, 21 Jan 2016 15:57:32 +0100 Gerald Schaefer <gerald.schae...@de.ibm.com> 
wrote:

> > --- a/fs/proc/task_mmu.c~numa-fix-proc-pid-numa_maps-on-s390-fix
> > +++ a/fs/proc/task_mmu.c
> > @@ -1523,6 +1523,7 @@ static int gather_pte_stats(pmd_t *pmd,
> >     pte_t *pte;
> > 
> >     ptl = pmd_trans_huge_lock(pmd, vma);
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >     if (ptl) {
> >             pte_t huge_pte = huge_ptep_get((pte_t *)pmd);
> >             struct page *page;
> > @@ -1534,6 +1535,7 @@ static int gather_pte_stats(pmd_t *pmd,
> >             spin_unlock(ptl);
> >             return 0;
> >     }
> > +#endif
> > 
> >     if (pmd_trans_unstable(pmd))
> >             return 0;
> 
> Hi Andrew,
> 
> Unfortunately this seems to be a lot more complicated than we thought.
> huge_ptep_get() is only defined when CONFIG_HUGETLB_PAGE=y, independent
> from CONFIG_TRANSPARENT_HUGEPAGE. This is because asm/hugetlb.h is only
> included from linux/hugetlb.h when CONFIG_HUGETLB_PAGE=y.
> 
> So this fix won't fix the build error when CONFIG_HUGETLBFS=n and
> CONFIG_TRANSPARENT_HUGEPAGE=y.
> 
> Since the THP code did not repeat the flaws of the hugtelbfs code, i.e.
> it is actually working on PMD entries and not PTE entries, there was
> no need for huge_ptep_get() for THP so far.
> 
> Now it seems that the THP code in gather_pte_stats() is an exception to
> this, as it is not working on a PMD like the rest of the THP code, but
> also on a fake "PTE" like the hugetlbfs code.
> 
> I guess this needs more thinking, two options are crossing my mind:
> - Fix the THP code in gather_pte_stats() to properly use a PMD instead of
>   PTE. This would probably require something like a "_pmd" version of
>   "can_gather_numa_stats()" and a pmd_dirty() check for the
>   gather_stats() parameter.
> - Make huge_ptep_get() also available for CONFIG_HUGETLBFS=n, perhaps
>   by introducing something like HAVE_ARCH_HUGE_PTEP_GET and implementing
>   the default NOP version in linux/hugetlbfs.h instead of the individual
>   asm/hugetlbfs.h files for all archs.
> 
> The first option seems more correct, but it might entail other problems.
> The second option would also introduce new problems on s390, where the
> implementation of huge_ptep_get() in arch/s390/mm/hugetlbpage.c is currently
> only built with CONFIG_HUGETLBFS=y, but I guess we could handle that.
> 
> Any thoughts / more ideas?
> 

The first option does of course sound better.

But you need numa_maps fixed, presumably in 4.5 and possibly backported
into -stable? (The changelog doesn't describe the end-user-visible
effects of the bug.  Naughty changelog!)

So is there some minimal thing we can do for now to get things working
properly and fix it for-real in 4.6?

Reply via email to