Hi Marek, > From: U-Boot <u-boot-boun...@lists.denx.de> On Behalf Of Marek Bykowski > Sent: jeudi 3 septembre 2020 02:01 > To: tr...@konsulko.com; s...@chromium.org; u-boot@lists.denx.de > Cc: Marek Bykowski <marek.bykow...@gmail.com> > Subject: [PATCH v1] armv8: MMU: Fix the memory map for RAM > > From: Marek Bykowski <marek.bykow...@gmail.com> > > The objective of this patch is to impose the correct attributes to the RAM > memory > for the ARM CPU, as shown in the diagram below: > > ------------------ > Non-Cached | | Read-Write > Ordered | Peripherals | Not-Executable > -----------------| > | | Read-Write > Cacheable | Data | Not-Executable > -----------------| > | | Read-Only > Cacheable | Code | Not-Executable
Code is executable I think... > -----------------| > |Non-U-Boot image| Read-Write > Cacheable | eg. efi | Executable > ------------------ > > U-Boot adheres into attributing the peripheral region/s into Read-Write, Not- > Executable but it actually fails attributing the RAM correctly. > It combines the whole RAM into Read-Write, Executable, in which the Code > should > be Read-Only, Executable and the Data - Read-Write, Non-Executable. > Also the (optional) Non-U-Boot region/s, eg. EFI, PSCI, holding the Code and > Data > need updating but it is left to the developers of the image/s to do so, if > needed. > Generally any new mapping introduced should take account of the appropriate > attributes for the Instructions and Data. > > The reason it is important is that these attrributes control how the processor > interacts with a location. Such as, if a location the ARM CPU is accessing is > Executable (translation table descriptor Execute-Never attribute bit cleared) > then > the ARM CPU fetches a number of instructions from that location, all at the > same > time. For example, Cortex-A57 can source up to 128 bits per fetch depending on > an alignment. If the CPU mispredicts to the Execute-Never region, it creates > the > memory fault but it actually never uses the instructions mispredicted. The CPU > branches away elsewhere. > > Therefore, as long as the MMU is programmed correctly these mispredictions > will > only affect the performance. But if we fail programming the MMU correctly and > if > the instruction fetch logic mispredicts to the non-instruction memory it may > eventually perturb it, eg. corrupt the FIFO, the control registers, or load > the unified > cache with the instructions the data side memory system hits into > subsequently. > > Following an application of the memory map as per-diagram above an attempt to > execute an instruction fetched from the Non-Executable memory creates an > Instruction Abort. Similarly, an attempt to Write to an address marked as > Read- > Only will result in with a Data Abort. Both aborts are labelled as Permission > Faults > and are easy to catch by the processor. If all DDR is " Not-Executable", excepted code of U-boot himself and EFI, I think that the standalone application can't be more execute except if the MMU configuration change before to execute it. See do_bootm_standalone(). PS: it is done for Linux in do_bootm_linux/ do_bootm_linux/ announce_and_cleanup..... caches ans MMU are deactivated For information I have the same issue on armV7 platform stm32mp1: speculative access on memory, used by OP-TEE, protected by firewall. I propose a other solution [1]: no more map the reserved memory region with the property "no-map", bt only for cache-cp15. It is based lmb library so it could done also in armv8 cache functions. [1] http://patchwork.ozlabs.org/project/uboot/list/?series=199486 > Signed-off-by: Marek Bykowski <marek.bykow...@gmail.com> > --- > Changes in PATCH v1: > - now it re-maps the whole RAM to the proper attributes, > - took account of other images, eg. PSCI, EFI that need a separate attention > - it has been tested on qemu arm 64 and two of my armv8 boards, one is Axxia > series of the processor, the other is so early that the name cannot be > revealed yet > > arch/arm/cpu/armv8/cache_v8.c | 103 > +++++++++++++++++++++++++++++++ > arch/arm/cpu/armv8/u-boot.lds | 39 ++++++++++-- > arch/arm/include/asm/armv8/mmu.h | 6 ++ > arch/arm/lib/sections.c | 19 ++++-- > include/asm-generic/sections.h | 9 +++ > 5 files changed, 164 insertions(+), 12 deletions(-) > > diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c > index 7c31d98a6f..4d8843d05e 100644 > --- a/arch/arm/cpu/armv8/cache_v8.c > +++ b/arch/arm/cpu/armv8/cache_v8.c > @@ -14,6 +14,7 @@ > #include <asm/cache.h> > #include <asm/system.h> > #include <asm/armv8/mmu.h> > +#include <asm/sections.h> > > DECLARE_GLOBAL_DATA_PTR; > > @@ -364,6 +365,100 @@ __weak u64 get_page_table_size(void) > return size; > } > > +__weak void force_remaping_ram(void) > +{ > + int i = 0; > + > + if (!(gd->flags & GD_FLG_RELOC)) > + return; > + > + struct mm_region mem_map_ram[] = { > + /* > + * Re-map the whole RAM to Read-Write, Non-Executable, and > + * then .text section/s to Read-Only, Executable. > + */ > + { > + .virt = (u64)gd->ram_base, > + .phys = (u64)gd->ram_base, > + .size = (u64)gd->ram_size, > + .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE | > + PTE_BLOCK_UXN > + }, > +#if IS_ENABLED(CONFIG_EFI_LOADER) > + { > + .virt = (u64)__efi_runtime_start_section, > + .phys = (u64)__efi_runtime_start_section, > + .size = (u64)(__efi_runtime_stop_section - > + __efi_runtime_start_section), > + .attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE) & > ~PTE_BLOCK_UXN > + }, > + { > + .virt = (u64)__efi_runtime_rel_start_section, > + .phys = (u64)__efi_runtime_rel_start_section, > + .size = (u64)(__efi_runtime_rel_stop_section - > + __efi_runtime_rel_start_section), > + .attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE) & > ~PTE_BLOCK_UXN > + }, > +#endif > + { > + .virt = (u64)__image_copy_start, > + .phys = (u64)__image_copy_start, > + .size = (u64)(__text_end - __image_copy_start), > + .attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE | > PTE_BLOCK_AP_RO) & > + ~PTE_BLOCK_UXN > + }, > + { > + .virt = (u64)__text_rest_start, > + .phys = (u64)__text_rest_start, > + .size = (u64)(__text_rest_end - __text_rest_start), > + .attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE | > PTE_BLOCK_AP_RO) & > + ~PTE_BLOCK_UXN > + }, > +#if IS_ENABLED(CONFIG_ARMV8_SECURE_BASE) > + { > + .virt = (u64)__secure_text_start, > + .phys = (u64)__secure_text_start, > + .size = (u64)(__secure_text_end - __secure_text_start), > + .attrs = (PTE_BLOCK_MEMTYPE(MT_NORMAL) | > + PTE_BLOCK_INNER_SHARE | > PTE_BLOCK_AP_RO) & > + ~PTE_BLOCK_UXN > + }, > +#endif > + { 0 } > + }; > + > + debug("Re-mapping RAM: Code to RO,XN=0, Data - RW,XN=1"); > + if (IS_ENABLED(CONFIG_EFI_LOADER) || > IS_ENABLED(CONFIG_ARMV8_SECURE_BASE)) > + debug(", Non-U-Boot images (eg. efi, psci) - RW,XN=0 > (unchanged)"); > + debug("\n"); > + > + for (; mem_map_ram[i].size || mem_map_ram[i].attrs; i++) { > + /* > + * MT_NORMAL - Normal Memory > + * MT_DEVICE_NGNRNE - Device Memory (we don't expect that > + * really for the RAM to happen...) > + * RO - read-only > + * RW - read-write > + * XN=0 - Executable > + * XN=1 - Non-executable > + */ > + debug("[%d]: 0x%llx-0x%llx %s%s%s\n", > + i, mem_map_ram[i].phys, mem_map_ram[i].phys + > + mem_map_ram[i].size, > + mem_map_ram[i].attrs & > PTE_BLOCK_MEMTYPE(MT_NORMAL) ? > + "MT_NORMAL" : "MT_DEVICE", > + mem_map_ram[i].attrs & PTE_BLOCK_AP_RO ? "-RO" : "- > RW", > + mem_map_ram[i].attrs & PTE_BLOCK_UXN ? > + "-XN=1" : "-XN=0"); > + add_map(&mem_map_ram[i]); > + } > +} > + > void setup_pgtables(void) > { > int i; > @@ -381,6 +476,14 @@ void setup_pgtables(void) > /* Now add all MMU table entries one after another to the table */ > for (i = 0; mem_map[i].size || mem_map[i].attrs; i++) > add_map(&mem_map[i]); > + > + /* > + * Force re-mapping RAM only if the generic linker script in use. > + * The boundaries of the regions for re-mapping are defined in > + * the generic ARM64 ld script and won't work for the custom ones. > + */ > + if (!IS_ENABLED(CONFIG_SYS_CUSTOM_LDSCRIPT)) > + force_remaping_ram(); > } > > static void setup_all_pgtables(void) > diff --git a/arch/arm/cpu/armv8/u-boot.lds b/arch/arm/cpu/armv8/u-boot.lds > index > 2554980595..8e98b143d5 100644 > --- a/arch/arm/cpu/armv8/u-boot.lds > +++ b/arch/arm/cpu/armv8/u-boot.lds > @@ -9,6 +9,7 @@ > > #include <config.h> > #include <asm/psci.h> > +#include <asm/armv8/mmu.h> > > OUTPUT_FORMAT("elf64-littleaarch64", "elf64-littleaarch64", "elf64- > littleaarch64") > OUTPUT_ARCH(aarch64) > @@ -20,25 +21,36 @@ SECTIONS > #endif > . = 0x00000000; > > - . = ALIGN(8); > + /* Align .text to the page size */ > + . = ALIGN(PAGE_SIZE); > .text : > { > *(.__image_copy_start) > CPUDIR/start.o (.text*) > + . = ALIGN(PAGE_SIZE); > + KEEP(*(.__text_end)) > } > > /* This needs to come before *(.text*) */ > - .efi_runtime : { > - __efi_runtime_start = .; > + .efi_runtime ALIGN(CONSTANT(COMMONPAGESIZE)): > + { > + KEEP(*(.__efi_runtime_start)) > + __efi_runtime_start = .; > *(.text.efi_runtime*) > *(.rodata.efi_runtime*) > *(.data.efi_runtime*) > - __efi_runtime_stop = .; > + __efi_runtime_stop = .; > + . = ALIGN(PAGE_SIZE); > + KEEP(*(.__efi_runtime_stop)) > } > > + . = ALIGN(PAGE_SIZE); > .text_rest : > { > + KEEP(*(.__text_rest_start)) > *(.text*) > + . = ALIGN(PAGE_SIZE); > + KEEP(*(.__text_rest_end)) > } > > #ifdef CONFIG_ARMV8_PSCI > @@ -54,14 +66,18 @@ SECTIONS > #define CONFIG_ARMV8_SECURE_BASE > #define __ARMV8_PSCI_STACK_IN_RAM > #endif > + . = ALIGN(PAGE_SIZE); > .secure_text CONFIG_ARMV8_SECURE_BASE : > AT(ADDR(.__secure_start) + SIZEOF(.__secure_start)) > { > + KEEP(*(.__secure_text_start)) > *(._secure.text) > . = ALIGN(8); > __secure_svc_tbl_start = .; > KEEP(*(._secure_svc_tbl_entries)) > __secure_svc_tbl_end = .; > + . = ALIGN(PAGE_SIZE); > + KEEP(*(.__secure_text_end)) > } > > .secure_data : AT(LOADADDR(.secure_text) + SIZEOF(.secure_text)) > @@ -113,15 +129,26 @@ SECTIONS > KEEP(*(SORT(.u_boot_list*))); > } > > - . = ALIGN(8); > + . = ALIGN(PAGE_SIZE); > + .efi_runtime_rel_start : > + { > + KEEP(*(.__efi_runtime_rel_start)) > + } > > - .efi_runtime_rel : { > + .efi_runtime_rel : > + { > __efi_runtime_rel_start = .; > *(.rel*.efi_runtime) > *(.rel*.efi_runtime.*) > __efi_runtime_rel_stop = .; > } > > + . = ALIGN(PAGE_SIZE); > + .efi_runtime_rel_stop : > + { > + KEEP(*(.__efi_runtime_rel_stop)) > + } > + > . = ALIGN(8); > > .image_copy_end : > diff --git a/arch/arm/include/asm/armv8/mmu.h > b/arch/arm/include/asm/armv8/mmu.h > index fc97c55114..571cc283eb 100644 > --- a/arch/arm/include/asm/armv8/mmu.h > +++ b/arch/arm/include/asm/armv8/mmu.h > @@ -59,6 +59,12 @@ > */ > #define PTE_BLOCK_MEMTYPE(x) ((x) << 2) > #define PTE_BLOCK_NS (1 << 5) > +/* > + * AP[1] bit is ignored by hardware and is > + * treated as if it was One in EL2/EL3 > + */ > +#define PTE_BLOCK_AP_RO (1 << 7) > +#define PTE_BLOCK_AP_RW (0 << 7) > #define PTE_BLOCK_NON_SHARE (0 << 8) > #define PTE_BLOCK_OUTER_SHARE (2 << 8) > #define PTE_BLOCK_INNER_SHARE (3 << 8) > diff --git a/arch/arm/lib/sections.c b/arch/arm/lib/sections.c index > 3bb2d8468c..b00d24843d 100644 > --- a/arch/arm/lib/sections.c > +++ b/arch/arm/lib/sections.c > @@ -3,6 +3,8 @@ > * Copyright 2013 Albert ARIBAUD <albert.u.b...@aribaud.net> > */ > > +#include <linux/compiler_types.h> > + > /** > * These two symbols are declared in a C file so that the linker > * uses R_ARM_RELATIVE relocation, rather than the R_ARM_ABS32 one @@ - > 18,6 +20,11 @@ > * aliasing warnings. > */ > > +char __text_end[0] __section(".__text_end"); char __text_rest_start[0] > +__section(".__text_rest_start"); char __text_rest_end[0] > +__section(".__text_rest_end"); char __secure_text_start[0] > +__section(".__secure_text_start"); > +char __secure_text_end[0] __section(".__secure_text_end"); > char __bss_start[0] __attribute__((section(".__bss_start"))); > char __bss_end[0] __attribute__((section(".__bss_end"))); > char __image_copy_start[0] __attribute__((section(".__image_copy_start"))); > @@ -26,10 +33,10 @@ char __rel_dyn_start[0] > __attribute__((section(".__rel_dyn_start"))); > char __rel_dyn_end[0] __attribute__((section(".__rel_dyn_end"))); > char __secure_start[0] __attribute__((section(".__secure_start"))); > char __secure_end[0] __attribute__((section(".__secure_end"))); > -char __secure_stack_start[0] > __attribute__((section(".__secure_stack_start"))); > -char __secure_stack_end[0] __attribute__((section(".__secure_stack_end"))); > -char __efi_runtime_start[0] __attribute__((section(".__efi_runtime_start"))); > -char __efi_runtime_stop[0] __attribute__((section(".__efi_runtime_stop"))); > -char __efi_runtime_rel_start[0] > __attribute__((section(".__efi_runtime_rel_start"))); > -char __efi_runtime_rel_stop[0] > __attribute__((section(".__efi_runtime_rel_stop"))); > +char __secure_stack_start[0] __section(".__secure_stack_start"); > +char __secure_stack_end[0] __section(".__secure_stack_end"); char > +__efi_runtime_start_section[0] __section(".__efi_runtime_start"); > +char __efi_runtime_stop_section[0] __section(".__efi_runtime_stop"); > +char __efi_runtime_rel_start_section[0] > +__section(".__efi_runtime_rel_start"); > +char __efi_runtime_rel_stop_section[0] > +__section(".__efi_runtime_rel_stop"); > char _end[0] __attribute__((section(".__end"))); > diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h > index > 0577238d60..c3dc0522ee 100644 > --- a/include/asm-generic/sections.h > +++ b/include/asm-generic/sections.h > @@ -72,6 +72,11 @@ extern void _start(void); > */ > #ifdef CONFIG_ARM > > +extern char __text_end[]; > +extern char __text_rest_start[]; > +extern char __text_rest_end[]; > +extern char __secure_text_start[]; > +extern char __secure_text_end[]; > extern char __bss_start[]; > extern char __bss_end[]; > extern char __image_copy_start[]; > @@ -79,6 +84,10 @@ extern char __image_copy_end[]; extern char > _image_binary_end[]; extern char __rel_dyn_start[]; extern char > __rel_dyn_end[]; > +extern char __efi_runtime_start_section[]; extern char > +__efi_runtime_stop_section[]; extern char > +__efi_runtime_rel_start_section[]; > +extern char __efi_runtime_rel_stop_section[]; > > #else /* don't use offsets: */ > > -- > 2.21.0.896.g6a6c0f1