On Sat, Jan 10, 2015 at 08:16:02PM +0000, Arnd Bergmann wrote:
> Regarding ARM64 in particular, I think it would be nice to investigate
> how to extend the THP code to cover 64KB TLBs when running with the 4KB
> page size. There is a hint bit in the page table to tell the CPU that
> a set of 16 aligned pages can share one TLB, and it would be nice to
> use that bit in Linux, and to make this case more common for anonymous
> mappings, and possible large file based mappings.

The generic THP code assumes that huge pages are done at the pmd level,
which means 2MB for arm64 with the 4KB page configuration. Hugetlb allows
larger ptes which are not necessarily at the pmd level, though we haven't
implemented this on arm64 and it's not transparent either. As a first
step, it would be nice if we at least unified the APIs between hugetlbfs
and THP (set_huge_pte_at() vs. set_pmd_at()).

I think you could do some arch-only tricks by pretending that you have a
pte table with only 16 entries and a dummy pmd (without a corresponding
hardware page table level) that can host a "huge" page (16 consecutive
ptes). But we would lose the 2MB transparent huge page, as I don't see
mm/huge_memory.c handling huge puds. We would also lose the ability to
build real 4-level page tables, since the pmd would be used as the dummy
level.

But it would be a nice investigation. Maybe something simpler would work,
like getting the mm layer to prefer contiguous 64KB ranges and doing the
detection in the arch set_pte_at().

-- 
Catalin
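To make the "detection in the arch set_pte_at()" idea concrete, here is a rough user-space sketch, not kernel code. It assumes a simplified pte model (valid bit 0, output address in bits 12..47, contiguous hint in bit 52); the names set_pte_at_sketch(), cont_group_eligible() and mk_pte() are made up for illustration and do not match the real kernel signatures. A real implementation would also need the TLB maintenance that goes with changing the hint, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified pte model for illustration only. */
typedef uint64_t pte_t;

#define PAGE_SHIFT    12u                          /* 4KB pages */
#define CONT_PTES     16u                          /* 16 x 4KB = 64KB */
#define PTE_VALID     ((pte_t)1 << 0)
#define PTE_CONT      ((pte_t)1 << 52)             /* contiguous hint */
#define PTE_ADDR_MASK ((((pte_t)1 << 48) - 1) & ~(((pte_t)1 << PAGE_SHIFT) - 1))

static uint64_t pte_pfn(pte_t pte)
{
	return (pte & PTE_ADDR_MASK) >> PAGE_SHIFT;
}

/* Everything except the output address and the hint itself. */
static pte_t pte_attrs(pte_t pte)
{
	return pte & ~(PTE_ADDR_MASK | PTE_CONT);
}

/* Hypothetical helper: build a valid pte from a pfn and attribute bits. */
static pte_t mk_pte(uint64_t pfn, pte_t attrs)
{
	return ((pte_t)pfn << PAGE_SHIFT) | attrs | PTE_VALID;
}

/*
 * A 16-entry aligned group qualifies if every entry is valid, all share
 * the same attributes, and the pfns are consecutive starting from a
 * 16-page (64KB) aligned physical frame.
 */
static bool cont_group_eligible(const pte_t *table, size_t idx)
{
	size_t start = idx & ~(size_t)(CONT_PTES - 1);
	pte_t first = table[start];

	if (!(first & PTE_VALID) || (pte_pfn(first) % CONT_PTES) != 0)
		return false;

	for (size_t i = 1; i < CONT_PTES; i++) {
		pte_t pte = table[start + i];

		/* attrs include the valid bit, so invalid entries fail here */
		if (pte_attrs(pte) != pte_attrs(first))
			return false;
		if (pte_pfn(pte) != pte_pfn(first) + i)
			return false;
	}
	return true;
}

/* Write one pte, then set or clear the hint on its whole 16-entry group. */
static void set_pte_at_sketch(pte_t *table, size_t nr_ptes, size_t idx, pte_t pte)
{
	size_t start = idx & ~(size_t)(CONT_PTES - 1);

	assert(idx < nr_ptes && nr_ptes % CONT_PTES == 0);

	/* Drop the hint first: the group may no longer qualify. */
	for (size_t i = 0; i < CONT_PTES; i++)
		table[start + i] &= ~PTE_CONT;

	table[idx] = pte & ~PTE_CONT;

	if (cont_group_eligible(table, idx))
		for (size_t i = 0; i < CONT_PTES; i++)
			table[start + i] |= PTE_CONT;
}
```

The check only ever fires if the mm layer happens to hand out 16 physically consecutive, 64KB-aligned pages for an aligned virtual range, which is why getting the allocator to prefer such ranges is the other half of the idea.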