Matthew Wilcox <wi...@infradead.org> writes:
> On Tue, Oct 24, 2023 at 08:06:04PM +0530, Aneesh Kumar K.V wrote:
>>  	ptep++;
>> -	pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
>>  	addr += PAGE_SIZE;
>> +	/*
>> +	 * increment the pfn.
>> +	 */
>> +	pte = pfn_pte(pte_pfn(pte) + 1, pte_pgprot((pte)));
>
> when i looked at this, it generated shit code.  did you check?
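
For reference, the two forms in the hunk above are meant to produce the same
next pte. The following is only a toy userspace model of that, not the kernel
code: the toy_* names, the shift of 12 and the mask are made-up values for
illustration, and it ignores endianness entirely, so it says nothing about
the code the compiler actually generates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real powerpc definitions. */
#define TOY_RPN_SHIFT 12
#define TOY_RPN_MASK  0x00fffffffffff000ULL /* bits assumed to hold the pfn */

typedef struct { uint64_t val; } toy_pte_t;

/* Extract the page frame number from the toy pte. */
static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return (pte.val & TOY_RPN_MASK) >> TOY_RPN_SHIFT;
}

/* Everything that is not the pfn counts as protection bits here. */
static uint64_t toy_pte_pgprot(toy_pte_t pte)
{
	return pte.val & ~TOY_RPN_MASK;
}

/* Build a toy pte from a pfn and protection bits. */
static toy_pte_t toy_pfn_pte(uint64_t pfn, uint64_t prot)
{
	toy_pte_t pte = { ((pfn << TOY_RPN_SHIFT) & TOY_RPN_MASK) | prot };
	return pte;
}

/* Old form: add one page's worth directly to the raw value. */
static toy_pte_t toy_next_by_shift(toy_pte_t pte)
{
	toy_pte_t next = { pte.val + (1ULL << TOY_RPN_SHIFT) };
	return next;
}

/* New form: split into pfn and prot, bump the pfn, rebuild. */
static toy_pte_t toy_next_by_pfn(toy_pte_t pte)
{
	return toy_pfn_pte(toy_pte_pfn(pte) + 1, toy_pte_pgprot(pte));
}

/* Returns 1 if both forms give the same next pte for this input. */
static int toy_forms_agree(uint64_t pfn, uint64_t prot)
{
	toy_pte_t pte = toy_pfn_pte(pfn, prot);

	return toy_next_by_shift(pte).val == toy_next_by_pfn(pte).val;
}
```

In this model the two forms agree as long as pfn + 1 still fits in the
assumed mask; at the top of the range the raw add would carry into the bits
above it while the pfn form would truncate.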

I didn't look ...

<goes and looks>

It's not super clear cut. There's some difference because pfn_pte()
contains two extra VM_BUG_ONs.

But with DEBUG_VM *off* the version using pfn_pte() generates *better*
code, or at least less code, ~160 instructions vs ~200.

For some reason the version using PTE_RPN_SHIFT seems to be byte
swapping the pte an extra two times, each of which generates ~8
instructions. But I can't see why.

I tried a few other things and couldn't come up with anything that
generated better code. But I'll keep poking at it tomorrow.

cheers