On Tue, 2016-12-06 at 07:34 +0100, Christophe LEROY wrote: > > Le 06/12/2016 à 02:18, Scott Wood a écrit : > > > > On Wed, 2016-09-21 at 10:11 +0200, Christophe Leroy wrote: > > > > > > Today there are two implementations of hugetlbpages which are managed > > > by exclusive #ifdefs: > > > * FSL_BOOKE: several directory entries points to the same single > > > hugepage > > > * BOOK3S: one upper level directory entry points to a table of hugepages > > > > > > In preparation of implementation of hugepage support on the 8xx, we > > > need a mix of the two above solutions, because the 8xx needs both cases > > > depending on the size of pages: > > > * In 4k page size mode, each PGD entry covers a 4M bytes area. It means > > > that 2 PGD entries will be necessary to cover an 8M hugepage while a > > > single PGD entry will cover 8x 512k hugepages. > > > * In 16 page size mode, each PGD entry covers a 64M bytes area. It means > > > that 8x 8M hugepages will be covered by one PGD entry and 64x 512k > > > hugepages will be covers by one PGD entry. > > > > > > This patch: > > > * removes #ifdefs in favor of if/else based on the range sizes > > > * merges the two huge_pte_alloc() functions as they are pretty similar > > > * merges the two hugetlbpage_init() functions as they are pretty similar > > [snip] > > > > > > @@ -860,16 +803,34 @@ static int __init hugetlbpage_init(void) > > > * if we have pdshift and shift value same, we don't > > > * use pgt cache for hugepd. > > > */ > > > - if (pdshift != shift) { > > > + if (pdshift > shift) { > > > pgtable_cache_add(pdshift - shift, NULL); > > > if (!PGT_CACHE(pdshift - shift)) > > > panic("hugetlbpage_init(): could not > > > create > > > " > > > "pgtable cache for %d bit > > > pagesize\n", shift); > > > } > > > +#ifdef CONFIG_PPC_FSL_BOOK3E > > > + else if (!hugepte_cache) { > > This else never triggers on book3e, because the way this function > > calculates > > pdshift is wrong for book3e (it uses PyD_SHIFT instead of > > HUGEPD_PxD_SHIFT). > > We later get OOMs because huge_pte_alloc() calculates pdshift correctly, > > tries to use hugepte_cache, and fails. > Ok, I'll check it again, I was expecting it to still work properly on > book3e, because after applying patch 3 it works properly on the 8xx.
On 8xx you probably happen to have a page size that yields "pdshift <= shift" even with the incorrect pdshift calculation, causing hugepte_cache to be allocated. The smallest hugepage size on 8xx is 512k compared to 4M on fsl- book3e. -Scott