On 09.08.2023 22:54, Shawn Anastasio wrote:
> On 8/1/23 8:18 AM, Jan Beulich wrote:
>> On 29.07.2023 00:21, Shawn Anastasio wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/ppc/include/asm/page.h
>>> @@ -0,0 +1,178 @@
>>> +#ifndef _ASM_PPC_PAGE_H
>>> +#define _ASM_PPC_PAGE_H
>>> +
>>> +#include <xen/types.h>
>>> +
>>> +#include <asm/bitops.h>
>>> +#include <asm/byteorder.h>
>>> +
>>> +#define PDE_VALID     PPC_BIT(0)
>>> +#define PDE_NLB_MASK  0xfffffffffffffUL
>>> +#define PDE_NLB_SHIFT 5UL
>>> +#define PDE_NLS_MASK  0x1f
>>> +
>>> +#define PTE_VALID     PPC_BIT(0)
>>> +#define PTE_LEAF      PPC_BIT(1)
>>> +#define PTE_REFERENCE PPC_BIT(55)
>>> +#define PTE_CHANGE    PPC_BIT(56)
>>> +
>>> +/* PTE Attributes */
>>> +#define PTE_ATT_SAO            PPC_BIT(59) /* Strong Access Ordering */
>>> +#define PTE_ATT_NON_IDEMPOTENT PPC_BIT(58)
>>> +#define PTE_ATT_TOLERANT       (PPC_BIT(58) | PPC_BIT(59))
>>> +
>>> +/* PTE Encoded Access Authority */
>>> +#define PTE_EAA_PRIVILEGED PPC_BIT(60)
>>> +#define PTE_EAA_READ       PPC_BIT(61)
>>> +#define PTE_EAA_WRITE      PPC_BIT(62)
>>> +#define PTE_EAA_EXECUTE    PPC_BIT(63)
>>> +
>>> +/* Field shifts/masks */
>>> +#define PTE_RPN_MASK  0x1fffffffffffUL
>>> +#define PTE_RPN_SHIFT 12UL
>>> +#define PTE_ATT_MASK  0x3UL
>>> +#define PTE_ATT_SHIFT 4UL
>>> +#define PTE_EAA_MASK  0xfUL
>>
>> Did you consider introducing just *_MASK values, utilizing MASK_INSR()
>> and MASK_EXTR() instead of open-coded shifting/masking?
>>
>
> I actually wasn't aware of MASK_INSR/MASK_EXTR when writing this. I've
> just looked into it though, and I don't think using them makes the code
> much cleaner. Specifically I'm looking at the implementations of
> `pte_to_paddr` and `pde_to_paddr`, which both need to ensure that the
> returned value retains its original shift.
>
> For example, with pte_to_paddr, this change would be:
>
> - return be64_to_cpu(pte.pte) & (PTE_RPN_MASK << PTE_RPN_SHIFT);
> + return MASK_EXTR(be64_to_cpu(pte.pte), PTE_RPN_MASK) << PTE_RPN_SHIFT;
>
> In addition to updating the definitions of the *_MASK macros to include
> the right-most padding zeros.
>
> - #define PTE_RPN_MASK 0x1fffffffffffUL
> + #define PTE_RPN_MASK 0x1fffffffffff000UL
This is the important change. With that the above will be

    return be64_to_cpu(pte.pte) & PTE_RPN_MASK;

and in cases where you want the un-shifted value you'd use MASK_EXTR().

>>> +/*
>>> + * Calculate the index of the provided virtual address in the provided
>>> + * page table struct
>>> + */
>>> +#define pt_index(pt, va) _Generic((pt), \
>>
>> Earlier on I did ask about the minimum compiler version you require for
>> building the PPC hypervisor. Iirc you said about any, but here you're
>> using another feature where I'm not sure how far back it is available.
>> As indicated back then, it would be nice if ./README could gain
>> respective information.
>>
>
> I'm locally building with gcc 10 (shipped with Debian 11), and the Xen
> CI image for ppc64le builds uses gcc 11 I believe. _Generic and the
> other features I'm using are certainly available further back, but I
> haven't personally tested anything earlier. If there are no objections,
> I'm tempted to just codify gcc 10 as the minimum supported compiler
> version and leave it up to users if they want to try testing on earlier
> versions.

Spelling out what you know works is good enough for starters. As you
say, people can propose updating when found too high.

>>> +    for ( page_addr = map_start; page_addr < map_end; page_addr += PAGE_SIZE )
>>> +    {
>>> +        struct lvl2_pd *lvl2;
>>> +        struct lvl3_pd *lvl3;
>>> +        struct lvl4_pt *lvl4;
>>> +        pde_t *pde;
>>> +        pte_t *pte;
>>> +
>>> +        /* Allocate LVL 2 PD if necessary */
>>> +        pde = pt_entry(lvl1, page_addr);
>>> +        if ( !pde_is_valid(*pde) )
>>> +        {
>>> +            lvl2 = lvl2_pd_pool_alloc();
>>> +            *pde = paddr_to_pde(__pa(lvl2), PDE_VALID, XEN_PT_ENTRIES_LOG2_LVL_2);
>>
>> paddr_to_pde() doesn't mask off the top bits. Are you relying on
>> lvl2_pd_pool_alloc() using PIC addressing to get at the value it
>> then returns?
>>
>
> I'm not sure I understand the question here.
> The pointer returned by lvl2_pd_pool_alloc() will indeed have the top
> address bits set in accordance with the link address, even before
> paging is enabled, because of the relocation code and jump to
> XEN_VIRT_START in patch 2. This is fine though, since I call __pa() to
> strip off these bits before passing it to paddr_to_pde.

I'm sorry, I managed to overlook the __pa().

>> Also this and the similar lines below look to be a little too long.
>
> Yeah, I noticed this as well, but my understanding was that lines
> longer than 80 columns were permitted in cases where they don't hurt
> readability. In any case, I can fix this.

There's only one general exception to the line length limit: printk()
(and alike) format strings want to live all on one line (unless there
are embedded \n), to allow reasonably grep-ing for them.

>>> +void radix__tlbiel_all(unsigned int action)
>>
>> Is the double underscore in here intentional?
>>
>
> It matches the original Linux file from which this routine was
> imported (referenced in the file's top-level comment).
>
> As I understand it, this is meant to indicate a private function that
> shouldn't be called outside of radix-specific MMU or TLB code. As
> Linux's radix support code spans multiple files, a static function
> wouldn't work for that.

We haven't adopted that style yet(?), but I'm also not opposed to you
doing so locally.

Jan