On Thu, Dec 01, 2022 at 04:11:56PM -0500, Stefan Berger wrote:
> From: Daniel Axtens <d...@axtens.net>
>
> On PowerVM, the first time we boot a Linux partition, we may only get
> 256MB of real memory area, even if the partition has more memory.
>
> This isn't enough to reliably verify a kernel. Fortunately, the Power
> Architecture Platform Reference (PAPR) defines a method we can call to ask
> for more memory: the broad and powerful ibm,client-architecture-support
> (CAS) method.
>
> CAS can do an enormous number of things on a PAPR platform: as well as
> asking for memory, you can set the supported processor level, the interrupt
> controller, hash vs radix mmu, and so on.
>
> If:
>
>  - we are running under what we think is PowerVM (compatible property of /
>    begins with "IBM"), and
>
>  - the full amount of RMA is less than 512MB (as determined by the reg
>    property of /memory)
>
> then call CAS as follows (refer to the Linux on Power Architecture
> Reference, LoPAR, which is public, at B.5.2.3):
>
>  - Use the "any" PVR value and supply 4 option vectors.
>
>  - Set option vector 1 (PowerPC Server Processor Architecture Level)
>    to "ignore".
>
>  - Set option vector 2 with default or Linux-like options, including a
>    min-rma-size of 512MB.
>
>  - Set option vector 3 to request Floating Point, VMX and Decimal Floating
>    point, but don't abort the boot if we can't get them.
>
>  - Set option vector 4 to request a minimum VP percentage of 1%, which is
>    what Linux requests, and is below the default of 10%. Without this,
>    some systems with very large or very small configurations fail to boot.
>
> This will cause a CAS reboot and the partition will restart with 512MB
> of RMA. Importantly, grub will notice the 512MB and not call CAS again.
>
> Notes about the choices of parameters:
>
>  - A partition can be configured with only 256MB of memory, which would
>    mean this request couldn't be satisfied, but PFW refuses to load with
>    only 256MB of memory, so it's a bit moot. (SLOF will run fine with 256MB,
>    but we will never call CAS under qemu/SLOF because /compatible won't
>    begin with "IBM".)
>
>  - Unspecified CAS vectors take on default values. Some of these values
>    might restrict the ability of certain hardware configurations to boot.
>    This is why we need to specify the VP percentage in vector 4, which is
>    in turn why we need to specify vector 3.
>
> Finally, we should have enough memory to verify a kernel, and we will
> reach Linux. One of the first things Linux does while still running under
> OpenFirmware is to call CAS with a much fuller set of options (including
> asking for 512MB of memory). Linux includes a much more restrictive set of
> PVR values and processor support levels, and this CAS invocation will likely
> induce another reboot. On this reboot grub will again notice the higher RMA,
> and not call CAS. We will get to Linux again, Linux will call CAS again, but
> because the values are now set for Linux this will not induce another CAS
> reboot and we will finally boot all the way to userspace.
>
> On all subsequent boots, everything will be configured with 512MB of RMA,
> so there will be no further CAS reboots from grub. (phyp is super sticky
> with the RMA size - it persists even on cold boots. So if you've ever booted
> Linux in a partition, you'll probably never have grub call CAS. It'll only
> ever fire the first time a partition loads grub, or if you deliberately lower
> the amount of memory your partition has below 512MB.)
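
For anyone following the boot sequence described above, the gating rule can be sketched as a small standalone C function. This is only an illustration of the two conditions in the commit message, not code from the patch: the name `should_call_cas`, the `MIN_RMA_BYTES` macro and the sample strings in the comments are hypothetical. The prefix check uses the four characters "IBM," as the cmain.c hunk does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 512MB threshold taken from the commit message. */
#define MIN_RMA_BYTES (512ULL * 1024 * 1024)

/*
 * Hypothetical helper (not part of the patch): returns 1 when both
 * conditions for calling CAS hold, 0 otherwise.
 *
 *   should_call_cas ("IBM,example-model", 256MB)  first boot on PowerVM
 *   should_call_cas ("IBM,example-model", 512MB)  after the CAS reboot
 *   should_call_cas ("other,example", 256MB)      non-IBM firmware
 */
static int
should_call_cas (const char *root_compatible, uint64_t rma_bytes)
{
  assert (root_compatible != NULL);

  /* Condition 1: the compatible property of / begins with "IBM,". */
  if (strncmp (root_compatible, "IBM,", 4) != 0)
    return 0;

  /* Condition 2: the RMA reported by /memory is below 512MB. */
  return rma_bytes < MIN_RMA_BYTES;
}
```

Because the second condition stops holding once the partition restarts with 512MB of RMA, the same check that triggers the first CAS call also prevents a second one on the following boot.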
>
> Signed-off-by: Daniel Axtens <d...@axtens.net>
> Signed-off-by: Stefan Berger <stef...@linux.ibm.com>
> ---
>  grub-core/kern/ieee1275/cmain.c  |   3 +
>  grub-core/kern/ieee1275/init.c   | 165 +++++++++++++++++++++++++++++++
>  include/grub/ieee1275/ieee1275.h |   8 ++
>  3 files changed, 176 insertions(+)
>
> diff --git a/grub-core/kern/ieee1275/cmain.c b/grub-core/kern/ieee1275/cmain.c
> index 4442b6a83..b707798ec 100644
> --- a/grub-core/kern/ieee1275/cmain.c
> +++ b/grub-core/kern/ieee1275/cmain.c
> @@ -123,6 +123,9 @@ grub_ieee1275_find_options (void)
>               break;
>             }
>         }
> +

#if defined(__powerpc__) ?

> +      if (grub_strncmp (tmp, "IBM,", 4) == 0)
> +        grub_ieee1275_set_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY);

#endif

>      }
>
>    if (is_smartfirmware)
> diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
> index 2adf4fdfc..0bc571e3e 100644
> --- a/grub-core/kern/ieee1275/init.c
> +++ b/grub-core/kern/ieee1275/init.c
> @@ -200,11 +200,176 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
>    return 0;
>  }
>
> +/*
> + * How much memory does OF believe it has? (regardless of whether
> + * it's accessible or not)
> + */
> +static grub_err_t
> +grub_ieee1275_total_mem (grub_uint64_t *total)
> +{
> +  grub_ieee1275_phandle_t root;
> +  grub_ieee1275_phandle_t memory;
> +  grub_uint32_t reg[4];
> +  grub_ssize_t reg_size;
> +  grub_uint32_t address_cells = 1;
> +  grub_uint32_t size_cells = 1;
> +  grub_uint64_t size;
> +
> +  /* If we fail to get to the end, report 0. */
> +  *total = 0;
> +
> +  /* Determine the format of each entry in `reg'. */
> +  if (grub_ieee1275_finddevice ("/", &root))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't find / node");
> +  if (grub_ieee1275_get_integer_property (root, "#address-cells", &address_cells,
> +                                          sizeof (address_cells), 0))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine #address-cells");
> +  if (grub_ieee1275_get_integer_property (root, "#size-cells", &size_cells,
> +                                          sizeof (size_cells), 0))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine #size-cells");
> +
> +  if (size_cells > address_cells)
> +    address_cells = size_cells;
> +
> +  /* Load `/memory/reg'. */
> +  if (grub_ieee1275_finddevice ("/memory", &memory))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't find /memory node");
> +  if (grub_ieee1275_get_integer_property (memory, "reg", reg,
> +                                          sizeof (reg), &reg_size))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine /memory/reg property");
> +  if (reg_size < 0 || (grub_size_t) reg_size > sizeof (reg))
> +    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "/memory response buffer exceeded");
> +
> +  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_BROKEN_ADDRESS_CELLS))
> +    {
> +      address_cells = 1;
> +      size_cells = 1;
> +    }
> +
> +  /* Decode only the size */
> +  size = reg[address_cells];
> +  if (size_cells == 2)
> +    size = (size << 32) | reg[address_cells + 1];
> +
> +  *total = size;
> +
> +  return grub_errno;
> +}
> +
> +#if defined(__powerpc__)
> +
> +/* See PAPR or arch/powerpc/kernel/prom_init.c */
> +struct option_vector2
> +{
> +  grub_uint8_t byte1;
> +  grub_uint16_t reserved;
> +  grub_uint32_t real_base;
> +  grub_uint32_t real_size;
> +  grub_uint32_t virt_base;
> +  grub_uint32_t virt_size;
> +  grub_uint32_t load_base;
> +  grub_uint32_t min_rma;
> +  grub_uint32_t min_load;
> +  grub_uint8_t min_rma_percent;
> +  grub_uint8_t max_pft_size;
> +} GRUB_PACKED;
> +
> +struct pvr_entry
> +{
> +  grub_uint32_t mask;
> +  grub_uint32_t entry;
> +};
> +
> +struct cas_vector
> +{
> +  struct
> +  {
> +    struct pvr_entry terminal;
> +  } pvr_list;
> +  grub_uint8_t num_vecs;
> +  grub_uint8_t vec1_size;
> +  grub_uint8_t vec1;
> +  grub_uint8_t vec2_size;
> +  struct option_vector2 vec2;
> +  grub_uint8_t vec3_size;
> +  grub_uint16_t vec3;
> +  grub_uint8_t vec4_size;
> +  grub_uint16_t vec4;
> +} GRUB_PACKED;
> +
> +/*
> + * Call ibm,client-architecture-support to try to get more RMA.
> + * We ask for 512MB which should be enough to verify a distro kernel.
> + * We ignore most errors: if we don't succeed we'll proceed with whatever
> + * memory we have.
> + */
> +static void
> +grub_ieee1275_ibm_cas (void)
> +{
> +  int rc;
> +  grub_ieee1275_ihandle_t root;
> +  struct cas_args
> +  {
> +    struct grub_ieee1275_common_hdr common;
> +    grub_ieee1275_cell_t method;
> +    grub_ieee1275_ihandle_t ihandle;
> +    grub_ieee1275_cell_t cas_addr;
> +    grub_ieee1275_cell_t result;
> +  } args;
> +  struct cas_vector vector =
> +  {
> +    .pvr_list = { { 0x00000000, 0xffffffff } }, /* any processor */
> +    .num_vecs = 4 - 1,
> +    .vec1_size = 0,
> +    .vec1 = 0x80, /* ignore */
> +    .vec2_size = 1 + sizeof (struct option_vector2) - 2,
> +    .vec2 = {
> +      0, 0, -1, -1, -1, -1, -1, 512, -1, 0, 48
> +    },
> +    .vec3_size = 2 - 1,
> +    .vec3 = 0x00e0, /* ask for FP + VMX + DFP but don't halt if unsatisfied */
> +    .vec4_size = 2 - 1,
> +    .vec4 = 0x0001, /* set required minimum capacity % to the lowest value */
> +  };
> +
> +  INIT_IEEE1275_COMMON (&args.common, "call-method", 3, 2);
> +  args.method = (grub_ieee1275_cell_t) "ibm,client-architecture-support";
> +  rc = grub_ieee1275_open ("/", &root);
> +  if (rc)
> +    {
> +      grub_error (GRUB_ERR_IO, "could not open root when trying to call CAS");
> +      return;
> +    }
> +  args.ihandle = root;
> +  args.cas_addr = (grub_ieee1275_cell_t) &vector;
> +
> +  grub_printf ("Calling ibm,client-architecture-support from grub...");
> +  IEEE1275_CALL_ENTRY_FN (&args);
> +  grub_printf ("done\n");
> +
> +  grub_ieee1275_close (root);
> +}
> +
> +#endif /* __powerpc__ */
> +
>  static void
>  grub_claim_heap (void)
>  {
>    unsigned long total = 0;
>
> +#if defined(__powerpc__)
> +  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY))
> +    {
> +      grub_uint64_t rma_size;
> +      grub_err_t err;
> +
> +      err = grub_ieee1275_total_mem (&rma_size);
> +      /* if we have an error, don't call CAS, just hope for the best */
> +      if (err == GRUB_ERR_NONE && rma_size < (512 * 1024 * 1024))
> +        grub_ieee1275_ibm_cas ();
> +    }
> +#endif
> +
>    grub_machine_mmap_iterate (heap_init, &total);
>  }
>  #endif
> diff --git a/include/grub/ieee1275/ieee1275.h b/include/grub/ieee1275/ieee1275.h
> index f53228703..b5c916d1d 100644
> --- a/include/grub/ieee1275/ieee1275.h
> +++ b/include/grub/ieee1275/ieee1275.h
> @@ -128,6 +128,14 @@ enum grub_ieee1275_flag
>    GRUB_IEEE1275_FLAG_CURSORONOFF_ANSI_BROKEN,
>
>    GRUB_IEEE1275_FLAG_RAW_DEVNAMES,
> +
> +  /*
> +   * On PFW, the first time we boot a Linux partition, we may only get 256MB of
> +   * real memory area, even if the partition has more memory. Set this flag if
> +   * we think we're running under PFW. Then, if this flag is set, and the RMA is
> +   * only 256MB in size, try asking for more with CAS.
> +   */

#if defined(__powerpc__) ?

> +  GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY,

#endif

Otherwise Reviewed-by: Daniel Kiper <daniel.ki...@oracle.com>...

Daniel

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel