On Tue, Aug 01, 2023 at 07:51:20PM -0400, Sean Anderson wrote: > On 8/1/23 11:25, Tom Rini wrote: > > On Mon, Jul 31, 2023 at 06:33:27PM -0400, Sean Anderson wrote: > > > >> On boards with size restrictions, 1-2k can be a significant fraction of > >> the binary size. Add a new SPL version of SYS_MALLOC_RUNTIME_INIT and > >> enable it by default. > >> > >> Signed-off-by: Sean Anderson <sean.ander...@seco.com> > >> --- > >> > >> Kconfig | 11 +++++++++++ > >> 1 file changed, 11 insertions(+) > >> > >> diff --git a/Kconfig b/Kconfig > >> index 4b32286b69..3cb31a9346 100644 > >> --- a/Kconfig > >> +++ b/Kconfig > >> @@ -381,6 +381,17 @@ config SYS_MALLOC_RUNTIME_INIT > >> reduce the size of U-Boot by letting malloc's data reside in .bss > >> instead of .data. > >> > >> +config SPL_SYS_MALLOC_RUNTIME_INIT > >> + bool "Initialize malloc's internal data at runtime in SPL" > >> + default y > >> + depends on SPL > >> + help > >> + Initialize malloc's internal data structures at SPL runtime, rather > >> + than at compile-time. This is necessary if relocating the malloc arena > >> + from a smaller static memory to a large DDR memory. It can also reduce > >> + the size of U-Boot by letting malloc's data reside in .bss instead of > >> + .data. > >> + > >> config TOOLS_DEBUG > >> bool "Enable debug information for tools" > >> help > > > > Can you use something like grabserial (or other tooling) to quantify the > > change on a platform or two? > > I had a ZynqMP board roughly based on xilinx_zynqmp_virt_defconfig lying > around. On this board, serial is not initialized until after malloc, so > I used the internal timer and printed the times later. DEBUG_UART would > probably be better but I didn't get it working. > > diff --git a/common/board_r.c b/common/board_r.c > index d798c00a80a..d7aee85e5b1 100644 > --- a/common/board_r.c > +++ b/common/board_r.c > @@ -194,6 +194,7 @@ static int initr_barrier(void) > return 0; > } > > +static ulong malloc_begin, malloc_end; > static int initr_malloc(void) > { > ulong malloc_start; > @@ -208,8 +209,10 @@ static int initr_malloc(void) > * reserve_noncached(). > */ > malloc_start = gd->relocaddr - TOTAL_MALLOC_LEN; > + malloc_begin = timer_get_boot_us(); > mem_malloc_init((ulong)map_sysmem(malloc_start, TOTAL_MALLOC_LEN), > TOTAL_MALLOC_LEN); > + malloc_end = timer_get_boot_us(); > return 0; > } > > @@ -569,6 +572,7 @@ static int dm_announce(void) > > static int run_main_loop(void) > { > + printf("malloc_init took %luus (%lu %lu)\n", malloc_end - > malloc_begin, malloc_begin, malloc_end); > #ifdef CONFIG_SANDBOX > sandbox_main_loop_init(); > #endif > diff --git a/common/spl/spl.c b/common/spl/spl.c > index d74acec10b5..09abcc74442 100644 > --- a/common/spl/spl.c > +++ b/common/spl/spl.c > @@ -755,7 +755,9 @@ void board_init_r(gd_t *dummy1, ulong dummy2) > spl_set_bd(); > > #if defined(CONFIG_SYS_SPL_MALLOC) > + ulong malloc_begin = timer_get_boot_us(); > mem_malloc_init(SYS_SPL_MALLOC_START, CONFIG_SYS_SPL_MALLOC_SIZE); > + ulong malloc_end = timer_get_boot_us(); > gd->flags |= GD_FLG_FULL_MALLOC_INIT; > #endif > if (!(gd->flags & GD_FLG_SPL_INIT)) { > @@ -817,6 +819,7 @@ void board_init_r(gd_t *dummy1, ulong dummy2) > spl_image.boot_device = BOOT_DEVICE_NONE; > board_boot_order(spl_boot_list); > > + printf("malloc_init took %luus (%lu %lu)\n", malloc_end - > malloc_begin, malloc_begin, malloc_end); > ret = boot_from_devices(&spl_image, spl_boot_list, > ARRAY_SIZE(spl_boot_list)); > if (ret) { > > I recorded times (in us) from five boots. The value of > (SPL_)SYS_MALLOC_RUNTIME_INIT is in parentheses: > > SPL (n) U-Boot (n) SPL (y) U-Boot (y) > ======= ========== ======= ========== > 192975 47557 135027 47557 > 192940 47557 135059 47557 > 193006 47557 134722 47558 > 193015 47556 135055 47557 > 193987 47557 134790 47557 > > So SPL took 60 ms shorter (?!) and U-Boot was mostly unaffected. Not > sure how that happened. The raw values for begin/end look reasonable: > > malloc_init took 47557us (6778108 6825665) > malloc_init took 47557us (6779290 6826847) > malloc_init took 47558us (6780379 6827937) > > etc. but to be honest I don't really see how SPL can spend 190ms setting > some variables and clearing the arena.
OK. Please repeat this for v2 and assuming that once again the timing change is marginal, I think we can emphasize it's generally a wash. -- Tom
signature.asc
Description: PGP signature