On 16/10/17 17:11, David Gibson wrote:
> On Mon, Oct 16, 2017 at 04:49:17PM +1100, Alexey Kardashevskiy wrote:
>> At the moment, on a guest with 256 CPUs + 256 PCI devices, it takes the
>> guest about 8.5sec to read the entire device tree. Some explanation can be
>> found here: https://patchwork.ozlabs.org/patch/826124/ but mostly it is
>> because the kernel traverses the tree twice and it calls "getprop" for
>> each property, which is really slow in SLOF as it searches from the
>> beginning of the linked list every time.
>>
>> Since SLOF has just learned to build FDT and this takes less than 0.5sec
>> for such a big guest, this makes use of the proposed client interface
>> method - "fdt-fetch".
>>
>> If "fdt-fetch" is not available, the old method is used.
>>
>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>
> I like the concept, few details though..
>
>> ---
>>  arch/powerpc/kernel/prom_init.c | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
>> index 02190e90c7ae..daa50a153737 100644
>> --- a/arch/powerpc/kernel/prom_init.c
>> +++ b/arch/powerpc/kernel/prom_init.c
>> @@ -2498,6 +2498,31 @@ static void __init flatten_device_tree(void)
>>  		prom_panic("Can't allocate initial device-tree chunk\n");
>>  	mem_end = mem_start + room;
>>
>> +	if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start,
>> +			   room - sizeof(mem_reserve_map))) {
>> +		u32 size;
>> +
>> +		hdr = (void *) mem_start;
>> +
>> +		/* Fixup the boot cpuid */
>> +		hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
>
> If SLOF is generating a tree it really should get this header field
> right as well.

Ah, I did not realize it is just a phandle from /chosen/cpu. Will fix.

>
>> +		/* Append the reserved map to the end of the blob */
>> +		hdr->off_mem_rsvmap = hdr->totalsize;
>> +		size = be32_to_cpu(hdr->totalsize);
>> +		rsvmap = (void *) hdr + size;
>> +		hdr->totalsize = cpu_to_be32(size + sizeof(mem_reserve_map));
>> +		memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
>
> .. and the reserve map for that matter.  I don't really understand
> what you're doing here.

?

Get the blob, increase the FDT size by sizeof(mem_reserve_map), fix up
totalsize and off_mem_rsvmap, and copy mem_reserve_map to the end of the
blob (the actual order in the code is slightly different, which may be a
bit confusing). Asking SLOF to reserve the space seems to be an unnecessary
complication of the interface: SLOF does not provide any reserved memory
records.

> Note also that the reserve map is required to
> be 8-byte aligned, which totalsize might not be.

Ah, good point.

>
>> +		/* Store the DT address */
>> +		dt_header_start = mem_start;
>> +
>> +#ifdef DEBUG_PROM
>> +		prom_printf("Fetched DTB: %d bytes to @%x\n", size, mem_start);
>> +#endif
>> +		goto print_exit;
>> +	}
>> +
>>  	/* Get root of tree */
>>  	root = call_prom("peer", 1, 1, (phandle)0);
>>  	if (root == (phandle)0)
>> @@ -2548,6 +2573,7 @@ static void __init flatten_device_tree(void)
>>  	/* Copy the reserve map in */
>>  	memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
>>
>> +print_exit:
>>  #ifdef DEBUG_PROM
>>  	{
>>  		int i;
>

-- 
Alexey
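
For reference on the alignment point above, here is a minimal standalone
sketch (not the kernel patch itself) of appending a reserve map at the next
8-byte boundary after a fetched blob. The name append_rsvmap(), the explicit
buffer-size argument and the byte-wise big-endian helpers are assumptions made
for illustration; only the header offsets of totalsize (4) and off_mem_rsvmap
(16) and the 40-byte header size are taken from the flattened device tree
header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets of two fields in the FDT header (all fields are big-endian). */
#define FDT_OFF_TOTALSIZE   4
#define FDT_OFF_MEM_RSVMAP  16
#define FDT_HEADER_SIZE     40

/* Hypothetical helpers: read/write a big-endian u32 independent of host order. */
uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Append a reserve map after the blob, padding so that the map starts on an
 * 8-byte boundary. Returns the new totalsize, or 0 if the result would not
 * fit in buf_size (in which case the blob is left untouched).
 */
uint32_t append_rsvmap(uint8_t *blob, size_t buf_size,
		       const void *rsvmap, uint32_t rsvmap_size)
{
	uint32_t size;
	uint64_t map_off, end;

	assert(blob != NULL && rsvmap != NULL);

	if (buf_size < FDT_HEADER_SIZE)
		return 0;

	size = get_be32(blob + FDT_OFF_TOTALSIZE);

	/* Round the current end of the blob up to a multiple of 8. */
	map_off = ((uint64_t)size + 7) & ~(uint64_t)7;
	end = map_off + rsvmap_size;
	if (end > (uint64_t)buf_size || end > UINT32_MAX)
		return 0;

	/* Zero the padding, then place the map and fix up the header. */
	memset(blob + size, 0, (size_t)(map_off - size));
	memcpy(blob + map_off, rsvmap, rsvmap_size);
	put_be32(blob + FDT_OFF_MEM_RSVMAP, (uint32_t)map_off);
	put_be32(blob + FDT_OFF_TOTALSIZE, (uint32_t)end);

	return (uint32_t)end;
}
```

The main difference from the hunk quoted above is that off_mem_rsvmap is set
to the rounded-up offset rather than to the old totalsize, and the new
totalsize includes the padding.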