Matthew Wilcox <wi...@infradead.org> writes:
> On Tue, Oct 24, 2023 at 08:06:04PM +0530, Aneesh Kumar K.V wrote:
>>              ptep++;
>> -            pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
>>              addr += PAGE_SIZE;
>> +            /*
>> +             * increment the pfn.
>> +             */
>> +            pte = pfn_pte(pte_pfn(pte) + 1, pte_pgprot((pte)));
>
> when i looked at this, it generated shit code.  did you check?

I didn't look ...

<goes and looks>

It's not super clear cut. There's some difference because pfn_pte()
contains two extra VM_BUG_ONs.

But with DEBUG_VM *off* the version using pfn_pte() generates *better*
code, or at least less code, ~160 instructions vs ~200.

For some reason the version using PTE_RPN_SHIFT seems to be byte
swapping the pte an extra two times, each of which generates ~8
instructions. But I can't see why.
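For anyone following along, here is a rough user-space model of the two ways of stepping the pte to the next page. It is only a sketch: the model_* names, the shift and the mask are made up for illustration, the pte is assumed to be stored big-endian on a little-endian CPU (so every pte_val()/__pte() style conversion is a byte swap), and the asserts merely stand in for the VM_BUG_ONs in pfn_pte() rather than reproducing them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for this model only, not taken from the kernel headers. */
#define MODEL_PAGE_SHIFT    12
#define MODEL_PTE_RPN_SHIFT MODEL_PAGE_SHIFT
#define MODEL_PTE_RPN_MASK  0x01fffffffffff000ULL

/* Model pte: the value is kept byte-swapped relative to the CPU value. */
typedef struct { uint64_t be; } model_pte_t;

/* Portable 64-bit byte swap. */
static uint64_t model_bswap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Stand-ins for pte_val() and __pte(): one byte swap each. */
static uint64_t model_pte_val(model_pte_t pte) { return model_bswap64(pte.be); }
static model_pte_t model_mk_pte(uint64_t val)
{
    model_pte_t pte = { model_bswap64(val) };
    return pte;
}

/* Stand-ins for pte_pfn(), pte_pgprot() and pfn_pte(). */
static uint64_t model_pte_pfn(model_pte_t pte)
{
    return (model_pte_val(pte) & MODEL_PTE_RPN_MASK) >> MODEL_PAGE_SHIFT;
}

static uint64_t model_pte_pgprot(model_pte_t pte)
{
    return model_pte_val(pte) & ~MODEL_PTE_RPN_MASK;
}

static model_pte_t model_pfn_pte(uint64_t pfn, uint64_t prot)
{
    /* Illustrative sanity checks, standing in for the VM_BUG_ONs. */
    assert(((pfn << MODEL_PAGE_SHIFT) & ~MODEL_PTE_RPN_MASK) == 0);
    assert((prot & MODEL_PTE_RPN_MASK) == 0);
    return model_mk_pte((pfn << MODEL_PAGE_SHIFT) | prot);
}

/* Old form: add one page's worth directly to the raw pte value. */
static model_pte_t model_next_pte_shift(model_pte_t pte)
{
    return model_mk_pte(model_pte_val(pte) + (1ULL << MODEL_PTE_RPN_SHIFT));
}

/* New form: split into pfn and protection bits, bump the pfn, rebuild. */
static model_pte_t model_next_pte_pfn(model_pte_t pte)
{
    return model_pfn_pte(model_pte_pfn(pte) + 1, model_pte_pgprot(pte));
}

/* Returns 1 if both forms produce the same stored pte, 0 otherwise. */
static int model_variants_agree(uint64_t pfn, uint64_t prot)
{
    model_pte_t pte = model_pfn_pte(pfn, prot);
    return model_next_pte_shift(pte).be == model_next_pte_pfn(pte).be;
}
```

In this model both forms give the same result as long as the increment does not carry out of the RPN field. Written naively like this, the pfn_pte() form does more conversions than the shift form, so the model does not explain the extra swaps I'm seeing in the PTE_RPN_SHIFT version; that comes down to what the compiler manages to fold.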

I tried a few other things and couldn't come up with anything that
generated better code. But I'll keep poking at it tomorrow.

cheers
