Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com> writes:

> "Aneesh Kumar K.V" <aneesh.ku...@linux.vnet.ibm.com> writes:
>
>> To support memory keys, we moved the hash PTE slot information to the
>> second half of the page table. This was fine for PTE entries at level 4
>> and level 3, because we already allocate larger page table pages at
>> those levels to accommodate the extra details. At level 4 the extra
>> space was used to track the 4K hash page table entry details, and at
>> the PMD level the extra space was allocated to track the THP details.
>>
>> With hugetlbfs PTEs, we used this extra space at the PMD level to store
>> the slot details. But we also support hugetlbfs PTEs at the PUD level,
>> and PUD level pages did not allocate the extra space. This resulted in
>> memory corruption.
>>
>> Fix this by allocating extra space at the PUD level when HUGETLB is
>> enabled. We may need further changes to allocate larger space at the
>> PMD level when we enable HUGETLB. That will be done in the next patch.
>>
>> Fixes: bf9a95f9a6481bc6e ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>
> Another fix. I still get random memory corruption in the hugetlb test
> with the 16G hugepage config.
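For reference, the PUD-level fix described in the quoted patch boils down
to sizing the PUD slab cache one bit larger when hugetlb is configured, so
the second half of the page can hold the slot details. Roughly (a sketch
only; the exact macro names and condition in that patch may differ):

	/*
	 * Sketch, not the exact patch: when hugetlb pages can be mapped
	 * at the PUD level, size the PUD allocation one index bit larger
	 * (i.e. twice as big) so the second half of the page can store
	 * the hash slot information.
	 */
	#if defined(CONFIG_HUGETLB_PAGE)
	#define PUD_CACHE_INDEX	(PUD_INDEX_SIZE + 1)
	#else
	#define PUD_CACHE_INDEX	PUD_INDEX_SIZE
	#endif

pud_alloc_one() then allocates from PGT_CACHE(PUD_CACHE_INDEX), which is
what the diff below relies on.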
Another one. I am not sure whether we really want this in this form, but
with this the tests are running fine.

-aneesh

commit 658fe8c310a913e69e5bc9a40d4c28a3b88d5c08
Author: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
Date:   Sat Feb 10 13:17:34 2018 +0530

    powerpc/mm/hash64: memset the pagetable pages on allocation.

    Now that we are using the second half of the table to store slot
    details and we don't clear them in huge_pte_get_and_clear(), we need
    to make sure we zero out the range on allocation. This does some
    extra work, because the first half of the table is cleared by
    huge_pte_get_and_clear() while the memset in this patch zeroes out
    the full table page. We need to do this for both the pgd and the pud,
    because both get allocated from the same slab cache.

    Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

---
The other option is to get huge_pte_get_and_clear() to clear the second
half of the page table. That requires generic changes, because we don't
have the hugetlb page size available there.

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 53df86d3cfce..adb7fba4b6c7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -73,10 +73,13 @@ static inline void radix__pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
+	pgd_t *pgd;
 	if (radix_enabled())
 		return radix__pgd_alloc(mm);
-	return kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE),
-				pgtable_gfp_flags(mm, GFP_KERNEL));
+	pgd = kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE),
+			       pgtable_gfp_flags(mm, GFP_KERNEL));
+	memset(pgd, 0, PGD_TABLE_SIZE);
+	return pgd;
 }
 
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
@@ -93,8 +96,11 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return kmem_cache_alloc(PGT_CACHE(PUD_CACHE_INDEX),
-				pgtable_gfp_flags(mm, GFP_KERNEL));
+	pud_t *pud;
+	pud = kmem_cache_alloc(PGT_CACHE(PUD_CACHE_INDEX),
+			       pgtable_gfp_flags(mm, GFP_KERNEL));
+	memset(pud, 0, PUD_TABLE_SIZE);
+	return pud;
 }
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
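To make the trade-off concrete, the alternative mentioned above would look
roughly like this on the generic side. This is a sketch under the
assumption that a size parameter gets plumbed through from the callers;
no such parameter exists today, which is exactly the generic change being
referred to:

	/*
	 * Sketch of the alternative, not an actual patch: if generic code
	 * passed the hugetlb page size down, the arch could clear the slot
	 * information in the second half of the table here instead of
	 * relying on memset at allocation time.  The 'sz' argument is the
	 * assumed generic change; the current huge_pte_get_and_clear()
	 * does not take it.
	 */
	static inline pte_t huge_pte_get_and_clear(struct mm_struct *mm,
						   unsigned long addr,
						   pte_t *ptep,
						   unsigned long sz)
	{
		pte_t pte = ptep_get_and_clear(mm, addr, ptep);

		/*
		 * Hypothetical arch hook: with sz known, hash64 could
		 * locate and zero the matching slot entries in the
		 * second half of the table as well.
		 */
		return pte;
	}

Whether that plumbing is worth it compared to the one-time memset in the
diff above is the open question with this patch.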