On 31.07.2025 17:58, Oleksii Kurochko wrote:
> Add support for breaking down large memory mappings ("superpages") in the RISC-V
> p2m mapping so that smaller, more precise mappings ("finer-grained entries")
> can be inserted into lower levels of the page table hierarchy.
> 
> To implement that the following is done:
> - Introduce p2m_split_superpage(): Recursively shatters a superpage into
>   smaller page table entries down to the target level, preserving original
>   permissions and attributes.
> - p2m_set_entry() updated to invoke superpage splitting when inserting
>   entries at lower levels within a superpage-mapped region.
> 
> This implementation is based on the ARM code, with modifications to the part
> that follows the BBM (break-before-make) approach. Some parts are simplified
> because, according to the RISC-V spec:
>   It is permitted for multiple address-translation cache entries to co-exist
>   for the same address. This represents the fact that in a conventional
>   TLB hierarchy, it is possible for multiple entries to match a single
>   address if, for example, a page is upgraded to a superpage without first
>   clearing the original non-leaf PTE’s valid bit and executing an SFENCE.VMA
>   with rs1=x0, or if multiple TLBs exist in parallel at a given level of the
>   hierarchy. In this case, just as if an SFENCE.VMA is not executed between
>   a write to the memory-management tables and subsequent implicit read of the
>   same address: it is unpredictable whether the old non-leaf PTE or the new
>   leaf PTE is used, but the behavior is otherwise well defined.
> In contrast to the Arm architecture, where BBM is mandatory and failing to
> use it in some cases can lead to CPU instability, RISC-V guarantees
> stability, and the behavior remains safe — though unpredictable in terms of
> which translation will be used.
> 
> Additionally, the page table walk logic has been adjusted, as ARM uses the
> opposite number of levels compared to RISC-V.

As before, I think you mean "numbering".

> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -539,6 +539,91 @@ static void p2m_free_subtree(struct p2m_domain *p2m,
>      p2m_free_page(p2m, pg);
>  }
>  
> +static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry,
> +                                unsigned int level, unsigned int target,
> +                                const unsigned int *offsets)
> +{
> +    struct page_info *page;
> +    unsigned long i;
> +    pte_t pte, *table;
> +    bool rv = true;
> +
> +    /* Convenience aliases */
> +    mfn_t mfn = pte_get_mfn(*entry);
> +    unsigned int next_level = level - 1;
> +    unsigned int level_order = XEN_PT_LEVEL_ORDER(next_level);
> +
> +    /*
> +     * This should only be called with target != level and the entry is
> +     * a superpage.
> +     */
> +    ASSERT(level > target);
> +    ASSERT(pte_is_superpage(*entry, level));
> +
> +    page = p2m_alloc_page(p2m->domain);
> +    if ( !page )
> +    {
> +        /*
> +         * The caller is in charge to free the sub-tree.
> +         * As we didn't manage to allocate anything, just tell the
> +         * caller there is nothing to free by invalidating the PTE.
> +         */
> +        memset(entry, 0, sizeof(*entry));
> +        return false;
> +    }
> +
> +    table = __map_domain_page(page);
> +
> +    /*
> +     * We are either splitting a second level 1G page into 512 first level
> +     * 2M pages, or a first level 2M page into 512 zero level 4K pages.
> +     */

Such a comment is at risk of (silently) going stale when support for 512G
mappings is added. I wonder if it's really that informative to have here.
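
To illustrate why a level-agnostic formulation would not go stale, here is a
minimal sketch of how the size covered by one entry follows from the level
alone. It assumes 4K base pages and 9 index bits per level; all sketch_*
names are hypothetical and not taken from the patch or from Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4K base pages, 9 index bits per level. */
#define SKETCH_PAGE_SHIFT  12u
#define SKETCH_LEVEL_BITS  9u
#define SKETCH_PT_ENTRIES  (1u << SKETCH_LEVEL_BITS)

/* log2 of the number of 4K pages covered by one entry at 'level'. */
static unsigned int sketch_level_order(unsigned int level)
{
    return level * SKETCH_LEVEL_BITS;
}

/* Bytes mapped by a single entry at 'level' (level 0 is the 4K leaf). */
static uint64_t sketch_level_size(unsigned int level)
{
    assert(level <= 4);
    return (uint64_t)1 << (SKETCH_PAGE_SHIFT + sketch_level_order(level));
}
```

With this, a level 3 entry comes out at 512G without any comment needing to
be touched, and each split always yields SKETCH_PT_ENTRIES entries of the
next lower level.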

> +    for ( i = 0; i < XEN_PT_ENTRIES; i++ )
> +    {
> +        pte_t *new_entry = table + i;
> +
> +        /*
> +         * Use the content of the superpage entry and override
> +         * the necessary fields. So the correct permission are kept.
> +         */

It's not just permissions though? The memory type field also needs
retaining (and is being retained this way). Maybe better say "attributes"?
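
As a rough sketch of what "attributes" covers: copying the superpage PTE and
replacing only the PPN field keeps the permission bits and the memory type
alike. The layout assumed below is the 64-bit RISC-V PTE (PPN in bits 53:10,
flags in bits 9:0) with the memory type in bits 62:61 as defined by Svpbmt;
the sketch_* names are hypothetical and this is not the patch's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: PPN occupies bits 53:10 of a 64-bit PTE. */
#define SKETCH_PTE_PPN_SHIFT  10u
#define SKETCH_PTE_PPN_BITS   44u
#define SKETCH_PTE_PPN_MASK \
    (((UINT64_C(1) << SKETCH_PTE_PPN_BITS) - 1) << SKETCH_PTE_PPN_SHIFT)

/*
 * Derive a lower-level PTE from a superpage PTE: every bit outside the
 * PPN field (permissions, A/D, memory type) is carried over unchanged.
 */
static uint64_t sketch_pte_with_ppn(uint64_t super_pte, uint64_t ppn)
{
    assert(ppn < (UINT64_C(1) << SKETCH_PTE_PPN_BITS));
    return (super_pte & ~SKETCH_PTE_PPN_MASK) | (ppn << SKETCH_PTE_PPN_SHIFT);
}
```

Anything stored outside the PPN field survives the split this way, which is
why "attributes" describes the effect more accurately than "permissions".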

Jan
