On Fri, Apr 04 2025, Jason Gunthorpe wrote:

> On Thu, Apr 03, 2025 at 05:37:06PM +0000, Pratyush Yadav wrote:
>
[...]
>> > This should be more like:
>> >
>> > union {
>> >       void *table;
>> >       phys_addr_t table_phys;
>> > };
>> >
>> > Since we are not using the low bits right now and it is alot cheaper
>> > to convert from va to phys only once during the final step. __va is
>> > not exactly fast.
>> 
>> The descriptor is used on _every_ level of the table, not just the
>> top.
>
> Yes
>
>> So if we use virtual addresses, at serialize time we would have to walk
>> the whole table and covert all addresses to physical.
>
> Yes
>
>> And __va() does
>> not seem to be doing too much. On x86, it expands to:
>> 
>>     #define __va(x)                  ((void *)((unsigned 
>> long)(x)+PAGE_OFFSET))
>> 
>> and on ARM64 to:
>> 
>>     #define __va(x)                  ((void 
>> *)__phys_to_virt((phys_addr_t)(x)))
>>     #define __phys_to_virt(x)        ((unsigned long)((x) - PHYS_OFFSET) | 
>> PAGE_OFFSET)
>
> Hmm, I was sure sparsemem added a bunch of stuff to this path, maybe
> I'm thinking of page_to_phys

Yep, page_to_phys for sparsemem is somewhat expensive, but __va() seems
to be fine.

    #define __page_to_pfn(pg)                                   \
    ({  const struct page *__pg = (pg);                         \
            int __sec = page_to_section(__pg);                  \
            (unsigned long)(__pg - 
__section_mem_map_addr(__nr_to_section(__sec)));     \
    })

>
>> >> +struct kho_mem_track {
>> >> + /* Points to L4 KHOMEM descriptor, each order gets its own table. */
>> >> + struct xarray orders;
>> >> +};
>> >
>> > I think it would be easy to add a 5th level and just use bits 63:57 as
>> > a 6 bit order. Then you don't need all this stuff either.
>> 
>> I am guessing you mean to store the order in the table descriptor
>> itself, instead of having a different table for each order.
>
> Not quite, I mean to index the per-order sub trees by using the high
> order bits. You still end up with N seperate bitmap trees, but instead
> of using an xarray to hold their top pointers you hold them in a 5th
> level.
>
>> Though now that I think of it, it is probably much simpler to just use
>> khomem_desc_t orders[NR_PAGE_ORDERS] instead of the xarray.
>
> Which is basically this, but encoding the index to orders in the address

I think this is way easier to wrap your head around compared to trying
to encode orders in address. That doesn't even work that well since
address has pretty much no connection to order. So a page at address say
0x100000 might be any order. So we would have to encode the order into
the address, and then decode it when doing table operations. Just having
a separate table indexed separately by orders is way simpler.

-- 
Regards,
Pratyush Yadav

Reply via email to