David Gibson <da...@gibson.dropbear.id.au> writes:
> On Wed, May 01, 2019 at 01:42:21PM +1000, Alexey Kardashevskiy wrote:
>> At the moment, on 256CPU + 256 PCI devices guest, it takes the guest
>> about 8.5sec to fetch the entire device tree via the client interface
>> as the DT is traversed twice - for strings blob and for struct blob.
>> Also, "getprop" is quite slow too as SLOF stores properties in a linked
>> list.
>>
>> However, since [1] SLOF builds flattened device tree (FDT) for another
>> purpose. [2] adds a new "fdt-fetch" client interface for the OS to fetch
>> the FDT.
>>
>> This tries the new method; if not supported, this falls back to
>> the old method.
>>
>> There is a change in the FDT layout - the old method produced
>> (reserved map, strings, structs), the new one receives only strings and
>> structs from the firmware and adds the final reserved map to the end,
>> so it is (fw reserved map, strings, structs, reserved map).
>> This still produces the same unflattened device tree.
>>
>> This merges the reserved map from the firmware into the kernel's reserved
>> map. At the moment SLOF generates an empty reserved map so this does not
>> change the existing behaviour in regard of reservations.
>>
>> This supports only v17 onward as only that version provides dt_struct_size
>> which works as "fdt-fetch" only produces v17 blobs.
>>
>> If "fdt-fetch" is not available, the old method of fetching the DT is used.
>>
>> [1] https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=e6fc84652c9c00
>> [2] https://git.qemu.org/?p=SLOF.git;a=commit;h=ecda95906930b80
>>
>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>
> Hrm. I've gotta say I'm not terribly convinced that it's worth adding
> a new interface we'll need to maintain to save 8s on a somewhat
> contrived testcase.

256 CPUs aren't that many anymore though, although I guess that many PCI
devices is still a little uncommon.

A 4 socket POWER8 or POWER9 can easily be that large, and a small test
kernel/userspace will boot in ~2.5-4 seconds. So it's possible that the
device tree fetch could be a surprisingly non-trivial percentage of boot
time, at least on some machines.

-- 
Stewart Smith
OPAL Architect, IBM.