> On 26 Nov 2017, at 12:38, Simon Glass <s...@chromium.org> wrote:
>
> Hi Philipp,
>
> On 25 November 2017 at 16:31, Dr. Philipp Tomsich
> <philipp.toms...@theobroma-systems.com
> <mailto:philipp.toms...@theobroma-systems.com>> wrote:
>> Hi,
>>
>>> On 25 Nov 2017, at 23:34, Simon Glass <s...@chromium.org> wrote:
>>>
>>> +Tom, Masahiro, Philipp
>>>
>>> Hi,
>>>
>>> On 22 November 2017 at 03:27, Wolfgang Denk <w...@denx.de> wrote:
>>>> Dear Kever Yang,
>>>>
>>>> In message <fd0bb500-80c4-f317-cc18-f7aaf1344...@rock-chips.com> you wrote:
>>>>>
>>>>> I can understand this feature, we always do dram_init_banks() first,
>>>>> then we relocate to 'known' area, then will be no risk to access memory.
>>>>> I believe there must be some historical reason for some kind of device,
>>>>> the relocate feature is a wonderful idea for it.
>>>>
>>>> This is actually not so much a feature needed to support some
>>>> specific device (in this case much simpler approaches would be
>>>> possible), but to support a whole set of features. Unfortunately
>>>> these appear to get forgotten / ignored over time.
>>>>
>>>>> many other SoCs should be similar.
>>>>> - Without relocate we can save many step, some of our customer really
>>>>> care much about the boot time duration.
>>>>> * no need to relocate everything
>>>>> * no need to copy all the code
>>>>> * no need init the driver more than once
>>>>
>>>> Please have a look at the README, section "Memory Management".
>>>> The relocation is not done to any _fixed_ address, but the address
>>>> is actually computed at runtime, depending on a number of features
>>>> enabled (at least this is how it used to be - apparently little of
>>>> this is tested on a regular basis, so I would not be surprised if
>>>> things are broken today).
>>>>
>>>> The basic idea was to reserve areas of memory at the top of RAM,
>>>> that would not be initialized / modified by U-Boot and Linux, not
>>>> even across a reset / warm boot.
>>>>
>>>> This was used for example for:
>>>>
>>>> - pRAM (Protected RAM) which could be used to store all kinds of data
>>>>   (for example, using a pramfs [Protected and Persistent RAM
>>>>   Filesystem]) that could be kept across reboots of the OS.
>>>>
>>>> - shared frame buffer / video memory. U-Boot and Linux would be able
>>>>   to initialize the video memory just once (in U-Boot) and then
>>>>   share it, maybe even across reboots. Especially, this would allow
>>>>   for a very early splash screen that gets passed (flicker free) to
>>>>   Linux until some Linux GUI takes over (much more difficult today).
>>>>
>>>> - shared log buffer: U-Boot and Linux used to use the same syslog
>>>>   buffer mechanism, so you could share it between U-Boot and Linux.
>>>>   This allows for example to
>>>>   * read the Linux kernel panic messages after reset in U-Boot; this
>>>>     is very useful when you bring up a new system and Linux crashes
>>>>     before it can display the log buffer on the console
>>>>   * pass U-Boot POST results on to Linux, so the application code
>>>>     can read and process these
>>>>   * process the system log of the previous run (especially after a
>>>>     panic) in Linux after it rebooted.
>>>>
>>>> etc.
>>>>
>>>> There are a number of such features which require to reserve room at
>>>> the top of RAM, the size of which is calculated at runtime, often
>>>> depending on user settable environment data.
>>>>
>>>> All this cannot be done without relocation to a (dynamically
>>>> computed) target address.
>>>>
>>>>
>>>> Yes, the code could be simpler and faster without that - but then,
>>>> you cut off a number of features.
>>>
>>> I would be interested in seeing benchmarks showing the cost of
>>> relocation in terms of boot time. Last time I did this was on Exynos 5
>>> and it was some years ago. The time was pretty small provided the
>>> cache was on for the memory copies associated with relocation itself.
>>> Something like 10-20ms but I don't have the numbers handy.
>>>
>>> I think it is useful to be able to allocate memory in board_init_f()
>>> for use by U-Boot for things like the display and the malloc() region.
>>>
>>> Options we might consider:
>>>
>>> 1. Don't relocate the code and data. Thus we could avoid the copy and
>>> relocation cost. This is already supported with the GD_FLG_SKIP_RELOC
>>> used when U-Boot runs as an EFI app
>>>
>>> 2. Rather than throwing away the old malloc() region, keep it around
>>> so existing allocated blocks work. Then the new malloc() region would be
>>> used for future allocations. We could perhaps ignore free() calls in
>>> that region
>>>
>>> 2a. This would allow us to avoid re-init of driver model in most cases
>>> I think. E.g. we could init serial and timer before relocation and
>>> leave them inited after relocation. We could just init the
>>> 'additional' devices not done before relocation.
>>>
>>> 2b. I suppose we could even extend this to SPL if we wanted to. I
>>> suspect it would just be a pain though, since SPL might use memory
>>> that U-Boot wants.
>>>
>>> 3. We could turn on the cache earlier. This removes most of the
>>> boot-time penalty. Ideally this should be turned on in SPL and perhaps
>>> redone in U-Boot which has more memory available. If SPL is not used,
>>> we could turn on the cache before relocation.
>>
>> Both turning on the cache and initialising the clocking could be of benefit
>> to boot-time.
>>
>> However, the biggest possible gain will come from utilising Falcon mode
>> to skip the full U-Boot stage and directly boot into the OS from SPL. This
>> assumes that the drivers involved are fully optimised, so loading up the
>> OS image does not take longer than necessary.
>
> I'd like to see numbers on that. From my experience, loading and
> running U-Boot does not take very long…

I was referring to the OS images, not to U-Boot itself. While U-Boot will be
less than 512KB, a typical kernel image will be a handful of MB… plus there
may be a few MB of ramdisk to accompany it.

>>
>>> 4. Rather than reserving memory in board_init_f() we could have it
>>> call malloc() from the expanded region. We could then perhaps
>>> move this reserve/allocate code in to particular drivers or
>>> subsystems, and drop a good chunk of the init sequence. We would need
>>> to have a larger malloc() region than is currently the case.
>>>
>>> There are still some arch-specific bits in board_init_f() which make
>>> these sorts of changes a bit tricky to support generically. IMO it
>>> would be best to move to 'generic relocation' written in C, where all
>>> archs work basically the same way, before attempting any of the above.
>>>
>>> Still, I can see some benefits and even some simplifications.
>>>
>>> Regards,
>>> Simon
>>
>
> Regards,
> Simon
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot
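
For readers following the thread: the point that the relocation target is
computed at runtime, by carving reserved areas off the top of RAM, can be
illustrated with a rough sketch. This is not U-Boot's actual code. The struct,
the function names, the 4 KiB alignment and the order of the reservations are
all assumptions made for illustration; the real logic lives in the
board_init_f() init sequence and differs in detail.

```c
#include <stdint.h>

/* Hypothetical configuration; none of these names come from U-Boot. */
struct reserve_cfg {
    uint32_t pram_kib;      /* protected RAM size in KiB (e.g. user-settable) */
    uint32_t logbuf_bytes;  /* shared log buffer size in bytes */
    uint32_t fb_bytes;      /* shared frame buffer size in bytes */
    uint32_t image_bytes;   /* size of the bootloader image to relocate */
};

/* Round an address down to a power-of-two alignment. */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
    return addr & ~(align - 1);
}

/*
 * Walk down from the top of RAM, reserving each area in turn. The address
 * left over after the last step is where the image would be copied to.
 * Returns 0 if the requested reservations cannot fit into RAM.
 */
static uint64_t compute_reloc_addr(uint64_t ram_base, uint64_t ram_size,
                                   const struct reserve_cfg *cfg)
{
    const uint64_t align = 4096;  /* assumed alignment for each area */
    uint64_t need = (uint64_t)cfg->pram_kib * 1024 + cfg->logbuf_bytes +
                    cfg->fb_bytes + cfg->image_bytes + 3 * align;
    uint64_t top = ram_base + ram_size;

    if (need > ram_size)
        return 0;

    top -= (uint64_t)cfg->pram_kib * 1024;            /* pRAM at the very top */
    top = align_down(top - cfg->logbuf_bytes, align); /* then the log buffer */
    top = align_down(top - cfg->fb_bytes, align);     /* then the frame buffer */
    top = align_down(top - cfg->image_bytes, align);  /* finally the image */
    return top;
}
```

Changing any one size (for instance the pRAM size) shifts every address below
it, including the relocation target. That is why a fixed link-time address
would not work for these features.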