On 09.08.2023 22:54, Shawn Anastasio wrote:
> On 8/1/23 8:18 AM, Jan Beulich wrote:
>> On 29.07.2023 00:21, Shawn Anastasio wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/ppc/include/asm/page.h
>>> @@ -0,0 +1,178 @@
>>> +#ifndef _ASM_PPC_PAGE_H
>>> +#define _ASM_PPC_PAGE_H
>>> +
>>> +#include <xen/types.h>
>>> +
>>> +#include <asm/bitops.h>
>>> +#include <asm/byteorder.h>
>>> +
>>> +#define PDE_VALID     PPC_BIT(0)
>>> +#define PDE_NLB_MASK  0xfffffffffffffUL
>>> +#define PDE_NLB_SHIFT 5UL
>>> +#define PDE_NLS_MASK  0x1f
>>> +
>>> +#define PTE_VALID     PPC_BIT(0)
>>> +#define PTE_LEAF      PPC_BIT(1)
>>> +#define PTE_REFERENCE PPC_BIT(55)
>>> +#define PTE_CHANGE    PPC_BIT(56)
>>> +
>>> +/* PTE Attributes */
>>> +#define PTE_ATT_SAO            PPC_BIT(59) /* Strong Access Ordering */
>>> +#define PTE_ATT_NON_IDEMPOTENT PPC_BIT(58)
>>> +#define PTE_ATT_TOLERANT       (PPC_BIT(58) | PPC_BIT(59))
>>> +
>>> +/* PTE Encoded Access Authority */
>>> +#define PTE_EAA_PRIVILEGED PPC_BIT(60)
>>> +#define PTE_EAA_READ       PPC_BIT(61)
>>> +#define PTE_EAA_WRITE      PPC_BIT(62)
>>> +#define PTE_EAA_EXECUTE    PPC_BIT(63)
>>> +
>>> +/* Field shifts/masks */
>>> +#define PTE_RPN_MASK  0x1fffffffffffUL
>>> +#define PTE_RPN_SHIFT 12UL
>>> +#define PTE_ATT_MASK  0x3UL
>>> +#define PTE_ATT_SHIFT 4UL
>>> +#define PTE_EAA_MASK  0xfUL
>>
>> Did you consider introducing just *_MASK values, utilizing MASK_INSR()
>> and MASK_EXTR() instead of open-coded shifting/masking?
>>
>
> I actually wasn't aware of MASK_INSR/MASK_EXTR when writing this. I've
> just looked into it though, and I don't think using them makes the code
> much cleaner. Specifically I'm looking at the implementations of
> `pte_to_paddr` and `pde_to_paddr`, which both need to ensure that the
> returned value retains its original shift.
>
> For example, with pte_to_paddr, this change would be:
>
> - return be64_to_cpu(pte.pte) & (PTE_RPN_MASK << PTE_RPN_SHIFT);
> + return MASK_EXTR(be64_to_cpu(pte.pte), PTE_RPN_MASK) << PTE_RPN_SHIFT;
>
> In addition to updating the definitions of the *_MASK macros to include
> the right-most padding zeros.
>
> - #define PTE_RPN_MASK 0x1fffffffffffUL
> + #define PTE_RPN_MASK 0x1fffffffffff000UL
This is the important change. With that the above will be

    return be64_to_cpu(pte.pte) & PTE_RPN_MASK;

and in cases where you want the un-shifted value you'd use MASK_EXTR().

>>> +/*
>>> + * Calculate the index of the provided virtual address in the provided
>>> + * page table struct
>>> + */
>>> +#define pt_index(pt, va) _Generic((pt), \
>>
>> Earlier on I did ask about the minimum compiler version you require for
>> building the PPC hypervisor. Iirc you said about any, but here you're
>> using another feature where I'm not sure how far back it is available.
>> As indicated back then, it would be nice if ./README could gain
>> respective information.
>>
>
> I'm locally building with gcc 10 (shipped with Debian 11), and the Xen
> CI image for ppc64le builds uses gcc 11 I believe. _Generic and the
> other features I'm using are certainly available further back, but I
> haven't personally tested anything earlier. If there are no objections,
> I'm tempted to just codify gcc 10 as the minimum supported compiler
> version and leave it up to users if they want to try testing on earlier
> versions.

Spelling out what you know works is good enough for starters. As you
say, people can propose updating when found too high.

>>> +    for ( page_addr = map_start; page_addr < map_end; page_addr += PAGE_SIZE )
>>> +    {
>>> +        struct lvl2_pd *lvl2;
>>> +        struct lvl3_pd *lvl3;
>>> +        struct lvl4_pt *lvl4;
>>> +        pde_t *pde;
>>> +        pte_t *pte;
>>> +
>>> +        /* Allocate LVL 2 PD if necessary */
>>> +        pde = pt_entry(lvl1, page_addr);
>>> +        if ( !pde_is_valid(*pde) )
>>> +        {
>>> +            lvl2 = lvl2_pd_pool_alloc();
>>> +            *pde = paddr_to_pde(__pa(lvl2), PDE_VALID, XEN_PT_ENTRIES_LOG2_LVL_2);
>>
>> paddr_to_pde() doesn't mask off the top bits. Are you relying on
>> lvl2_pd_pool_alloc() using PIC addressing to get at the value it
>> then returns?
>>
>
> I'm not sure I understand the question here.
> The pointer returned by lvl2_pd_pool_alloc() will indeed have the top
> address bits set in accordance with the link address, even before
> paging is enabled, because of the relocation code and jump to
> XEN_VIRT_START in patch 2. This is fine though, since I call __pa() to
> strip off these bits before passing it to paddr_to_pde.

I'm sorry, I managed to overlook the __pa().

>> Also this and the similar lines below look to be a little too long.
>
> Yeah, I noticed this as well, but my understanding was that lines
> longer than 80 columns were permitted in cases where they don't hurt
> readability. In any case, I can fix this.

There's only one general exception to the line length limit: printk()
(and alike) format strings want to live all on one line (unless there
are embedded \n), to allow reasonably grep-ing for them.

>>> +void radix__tlbiel_all(unsigned int action)
>>
>> Is the double underscore in here intentional?
>>
>
> It matches the original Linux file from which this routine was
> imported (referenced in the file's top-level comment).
>
> As I understand it, this is meant to indicate a private function that
> shouldn't be called outside of radix-specific MMU or TLB code. As
> Linux's radix support code spans multiple files, a static function
> wouldn't work for that.

We haven't adopted that style yet(?), but I'm also not opposed to you
doing so locally.

Jan