On 2021-10-31 12:44, Mark Kettenis wrote:
Date: Sun, 31 Oct 2021 11:43:38 +0100
From: Michael Walle <mich...@walle.cc>
Hi,
I sometimes see a corrupted initrd during kernel boot on my board
(kontron_sl28). Debugging showed that in this case the spin table
for the secondary CPUs overlaps with the lmb initrd allocations
(initrd_high is set).
I had a look at how the fdt and initrd are relocated (if enabled). In
summary, they use lmb_alloc(), which will first add all available memory
and then carve out reserved regions. There are calls for arch
(arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
interesting thing is that there is also code to carve out any
reserved regions which were added to the fdt earlier
(boot_fdt_add_mem_rsv_regions()). The problem here is that the DT
fixups, which might add the reserved regions, are called just before
jumping to linux. Thus both the allocation for the fdt (if fdt_high is
set) and the ramdisk (if initrd_high is set) will ignore any reserved
memory regions that the fixups add.
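To make the ordering problem concrete, here is a minimal sketch in plain C. It is not U-Boot's lmb code: toy_lmb, toy_reserve(), toy_alloc() and all addresses and sizes are hypothetical and only model "add all memory, carve out what is known, then allocate top-down".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESERVED 8

struct region { uint64_t base, size; };

/* Toy stand-in for an lmb instance: one memory bank plus a reserved list. */
struct toy_lmb {
    struct region mem;
    struct region reserved[MAX_RESERVED];
    size_t nreserved;
};

static bool overlaps(uint64_t a, uint64_t asz, uint64_t b, uint64_t bsz)
{
    return a < b + bsz && b < a + asz;
}

static void toy_init(struct toy_lmb *l, uint64_t base, uint64_t size)
{
    l->mem = (struct region){ base, size };
    l->nreserved = 0;
}

static bool toy_reserve(struct toy_lmb *l, uint64_t base, uint64_t size)
{
    if (l->nreserved == MAX_RESERVED)
        return false;
    l->reserved[l->nreserved++] = (struct region){ base, size };
    return true;
}

/* Top-down allocation that skips reserved regions; returns 0 on failure. */
static uint64_t toy_alloc(struct toy_lmb *l, uint64_t size, uint64_t align)
{
    assert(align && (align & (align - 1)) == 0); /* power of two only */
    if (size == 0 || size > l->mem.size)
        return 0;

    uint64_t addr = (l->mem.base + l->mem.size - size) & ~(align - 1);
    while (addr >= l->mem.base) {
        bool hit = false;
        for (size_t i = 0; i < l->nreserved; i++) {
            const struct region *r = &l->reserved[i];
            if (overlaps(addr, size, r->base, r->size)) {
                if (r->base < size)
                    return 0;
                addr = (r->base - size) & ~(align - 1); /* step below it */
                hit = true;
                break;
            }
        }
        if (!hit) {
            toy_reserve(l, addr, size);
            return addr;
        }
    }
    return 0;
}

/*
 * Hypothetical layout: 1 GiB of RAM at 0x80000000, a 64 KiB spin table
 * near the top, and a 2 MiB initrd. Returns true if the initrd lands on
 * the spin table.
 */
static bool initrd_clobbers_spin_table(bool reserve_before_alloc)
{
    const uint64_t spin_base = 0xbff00000, spin_size = 0x10000;
    const uint64_t initrd_size = 0x200000;
    struct toy_lmb lmb;

    toy_init(&lmb, 0x80000000, 0x40000000);
    if (reserve_before_alloc)
        toy_reserve(&lmb, spin_base, spin_size);

    uint64_t initrd = toy_alloc(&lmb, initrd_size, 0x1000);

    /* Models the DT fixups running just before the jump: too late. */
    if (!reserve_before_alloc)
        toy_reserve(&lmb, spin_base, spin_size);

    return overlaps(initrd, initrd_size, spin_base, spin_size);
}
```

Calling initrd_clobbers_spin_table(false) models the current order (allocate first, fixups later) and reports an overlap; with true the reservation is known before the allocation and the initrd is placed below the spin table.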
Unfortunately, I don't see any good way to fix this. You'd need all the
DT fixups to run before we can initialize the lmb. Also, I don't know if
this will affect any other areas; probably I'm the only one who reserves
an area which is outside of the u-boot code and data segment.
A hackish way would be to carve out the spin_table code in
board_lmb_reserve(). But meh..
The spin table is embedded in the u-boot binary itself, isn't it? But
the memory occupied by u-boot should already be reserved...
Yes, it is. As long as it doesn't overlap with the 64k EFI code page.
See below.
Unless CONFIG_EFI_LOADER is defined. Then it relocates the spin table
to memory allocated using efi_allocate_pages(). But that function
only looks at the EFI memory map to figure out what memory is
available. So I suspect that it might hand out the same memory as
lmb_alloc(). It all looks a bit broken to me...
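The suspicion above boils down to two allocators keeping separate books over the same RAM. A minimal sketch, assuming nothing about the real efi_allocate_pages() or lmb_alloc() internals (toy_map, toy_take() and toy_mirror() are hypothetical names, and the addresses are made up):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical top-down bump allocator; stands in for either bookkeeping. */
struct toy_map {
    uint64_t base; /* lowest usable address */
    uint64_t top;  /* everything at or above this is considered taken */
};

static uint64_t toy_take(struct toy_map *m, uint64_t size)
{
    assert(size > 0);
    if (m->top - m->base < size)
        return 0;
    m->top -= size;
    return m->top;
}

/* Tell another map that everything from addr upward is already in use. */
static void toy_mirror(struct toy_map *m, uint64_t addr)
{
    if (addr < m->top)
        m->top = addr;
}

static bool ranges_overlap(uint64_t a, uint64_t asz, uint64_t b, uint64_t bsz)
{
    return a < b + bsz && b < a + asz;
}

/*
 * The "EFI" map hands out 64 KiB for the spin table, the "LMB" map hands
 * out 2 MiB for the initrd. Both start from the same view of RAM.
 */
static bool spin_table_overlaps_initrd(bool mirror)
{
    struct toy_map efi = { 0x80000000, 0xc0000000 };
    struct toy_map lmb = { 0x80000000, 0xc0000000 };

    uint64_t spin = toy_take(&efi, 0x10000);
    if (mirror)
        toy_mirror(&lmb, spin); /* share the reservation */
    uint64_t initrd = toy_take(&lmb, 0x200000);

    return ranges_overlap(spin, 0x10000, initrd, 0x200000);
}
```

Without mirroring, both maps hand out memory from the same top of RAM and the two ranges overlap; once the first allocation is made visible to the second map, they no longer do.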
Yes, that is actually my code ;) The kontron_sl28 is the only
board which uses spin tables as far as I know. It doesn't support
PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
it uses spin tables. The relocation code works around a problem
with the reserved EFI code, see [1].
And yes, it actually is broken. But so might be any code which
uses efi_allocate_pages(), no? LMB isn't global, but is
just initialized at different places. Like before a linux kernel
is booted or when you load a file (?). And every time the whole
memory is added, and then different regions are carved out (see
above).
Does your target end up with CONFIG_EFI_LOADER defined?
Yes ;)
-michael
[1] https://lore.kernel.org/u-boot/20200601195336.3237-1-mich...@walle.cc/