This series is an attempt at cleaning up some ARM inline assembly, namely the MRC/MCR CP15 instructions, with two objectives:

1. Getting rid of some suboptimal changes, including a recent workaround
   for kirkwood. I am referring to commits that force -marm (Arm 32-bit
   instructions) on whole C files just because they use the MCR/MRC
   instructions in inline assembly, which Thumb-1 does not support, as
   well as patches that disable LTO on select files for much the same
   reason. Those commits are the following:

   62e92077a893 ("arm: support Thumb-1 with CONFIG_SYS_THUMB_BUILD")
   8f9696510afc ("ARM: make LTO available")
   e5fc9037dd33 ("ARM: fix LTO build for some thumb-interwork cases")
   410d59095a9f ("arm: kirkwood: fix freeze on boot")

2. Hopefully getting smaller binaries for Arm platforms that use Thumb
   and/or LTO, at the cost of marginal increases (say a few dozen bytes
   at most) for the other Arm boards due to not inlining a few
   functions. And of course no change for non-Arm architectures.

There are four patches in this series:

- The first one reverts a change which does solve a linker error but
  does not work, as [1] has shown.
- The second patch moves inline assembly into separate .S files.
- With that in place, it is possible to remove the -marm overrides as
  well as the C flags overrides selectively disabling LTO. This is what
  the third patch does.
- The fourth patch is not directly related, but tries to reduce the
  binary size further for platforms that compile C with
  -ffunction-sections -fdata-sections so that the linker can eliminate
  unreferenced code and data with --gc-sections. I noticed that .S files
  do not follow the same convention. By modifying the ENTRY, WEAK and
  ENDPROC macros, each assembly function is emitted in its own section
  (.text.<name>) like the C compiler would do if it were C code compiled
  with -ffunction-sections, thus allowing dead code elimination for .S
  too.
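
To make the second and fourth patches more concrete, here is a rough, host-runnable sketch of the assembler text that per-function-section macros could produce for a small CP15 accessor. It is not the code from the patches: SKETCH_ENTRY, SKETCH_ENDPROC, sketch_get_cr() and has_own_section() are hypothetical names, the use of .pushsection/.popsection is an assumption about how such macros might be written, and get_cr is only used as an example of a function that reads a CP15 register.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stringify a token (two levels so that macro arguments are expanded). */
#define STR_(x) #x
#define STR(x) STR_(x)

/*
 * Hypothetical per-function-section variants of ENTRY/ENDPROC, written
 * here as the assembler text they would emit. This is an illustration
 * only, not the content of include/linux/linkage.h.
 */
#define SKETCH_ENTRY(name) \
	".pushsection .text." STR(name) ", \"ax\"\n" \
	".globl " STR(name) "\n" \
	STR(name) ":\n"

#define SKETCH_ENDPROC(name) \
	".type " STR(name) ", %function\n" \
	".size " STR(name) ", .-" STR(name) "\n" \
	".popsection\n"

/*
 * Text for a stand-alone accessor that reads the CP15 system control
 * register, as it could appear in a .S file instead of C inline assembly.
 */
static const char *sketch_get_cr(void)
{
	return SKETCH_ENTRY(get_cr)
	       "\tmrc p15, 0, r0, c1, c0, 0\n"
	       "\tbx lr\n"
	       SKETCH_ENDPROC(get_cr);
}

/* Returns 1 when asm_text opens a dedicated .text.<name> section. */
static int has_own_section(const char *asm_text, const char *name)
{
	char want[128];

	assert(asm_text && name);
	snprintf(want, sizeof(want), ".pushsection .text.%s,", name);
	return strstr(asm_text, want) != NULL;
}
```

Text of this shape, once assembled with an Arm toolchain, would place get_cr in .text.get_cr, so a link with --gc-sections could drop it on boards where nothing references it. Because the MRC instruction then lives in a .S file, the C callers would no longer need -marm.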

The savings from the fourth patch are not much, though (100 or 200 bytes
most of the time), and it makes a couple of boards significantly bigger
(imx8mn_beacon_fspi: all +16196 data +16304; stm32* as well as
imxrt1170-evk: all +2K text +2K). This needs further investigation. See
[6] for details.

Now for the results:

- CI status can be seen at [2]. All good.
- buildman output showing size differences before and after the series:

  $ unbuffer tools/buildman/buildman -b lto-fixes-squashed -c 2 -sS \
      | ansi2html >buildman-series-squashed-1e2c64f1537-size-summary.html

  See [3] (arch averages)

  $ unbuffer tools/buildman/buildman -b lto-fixes-squashed -c 2 -sSd \
      | ansi2html >buildman-series-squashed-1e2c64f1537-size-details.html

  See [4] (per board)

- buildman output showing size differences for each patch:

  $ unbuffer tools/buildman/buildman -b lto-fixes -c 4 -sS \
      | ansi2html >buildman-series-1e2c64f1537-size-summary.html

  See [5] (arch averages)

  $ unbuffer tools/buildman/buildman -b lto-fixes -c 4 -sSd \
      | ansi2html >buildman-series-1e2c64f1537-size-details.html

  See [6] (per board)

[1] https://lists.denx.de/pipermail/u-boot/2025-June/592682.html
[2] https://source.denx.de/u-boot/custodians/u-boot-net/-/pipelines/26972
[3] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-squashed-1e2c64f1537-size-summary.html
[4] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-squashed-1e2c64f1537-size-details.html
[5] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-1e2c64f1537-size-summary.html
[6] https://people.linaro.org/~jerome.forissier/u-boot-07-Jul-2025/buildman-series-1e2c64f1537-size-details.html

Jerome Forissier (4):
  Revert "arm: asm/system.h: mrc and mcr need .arm if __thumb2__ is not set"
  arm: move inline assembly CP15 instructions to separate .S files
  arm: do not force -marm on some C files and allow LTO everywhere
  linkage: use per-function section in ENTRY, WEAK and ENDPROC

 arch/arm/cpu/arm926ejs/Makefile            |   7 +-
 arch/arm/cpu/arm926ejs/cache.c             |  32 ++----
 arch/arm/cpu/arm926ejs/cp15.S              |  46 ++++++++
 arch/arm/cpu/arm926ejs/cpu.c               |  13 +--
 arch/arm/cpu/armv7/lowlevel_init.S         |   4 -
 arch/arm/cpu/armv7/ls102xa/psci.S          |   2 +-
 arch/arm/cpu/armv7/nonsec_virt.S           |  16 +--
 arch/arm/cpu/armv7/psci.S                  | 118 ++++++++++-----------
 arch/arm/cpu/armv7/start.S                 |  20 ++--
 arch/arm/cpu/armv8/cache.S                 |  26 -----
 arch/arm/cpu/armv8/psci.S                  |  22 ++--
 arch/arm/cpu/armv8/tlb.S                   |   2 -
 arch/arm/cpu/armv8/transition.S            |   8 --
 arch/arm/include/asm/system.h              |  36 ++-----
 arch/arm/lib/Makefile                      |  17 ++-
 arch/arm/lib/ashldi3.S                     |   8 +-
 arch/arm/lib/ashrdi3.S                     |   8 +-
 arch/arm/lib/bitops.S                      |   8 --
 arch/arm/lib/cache-cp15.c                  |  62 ++++-------
 arch/arm/lib/cache.c                       |   7 +-
 arch/arm/lib/cp15.S                        |  92 ++++++++++++++++
 arch/arm/lib/crt0.S                        |   4 +-
 arch/arm/lib/div64.S                       |   2 -
 arch/arm/lib/lib1funcs.S                   |  36 ++-----
 arch/arm/lib/lshrdi3.S                     |   8 +-
 arch/arm/lib/muldi3.S                      |   8 +-
 arch/arm/lib/relocate.S                    |   6 +-
 arch/arm/lib/semihosting.S                 |   2 -
 arch/arm/lib/setjmp.S                      |   6 --
 arch/arm/lib/setjmp_aarch64.S              |   6 --
 arch/arm/lib/uldivmod.S                    |   2 -
 arch/arm/mach-imx/mx5/lowlevel_init.S      |   4 +-
 arch/arm/mach-kirkwood/Makefile            |   8 +-
 arch/arm/mach-kirkwood/cp15.S              |  13 +++
 arch/arm/mach-kirkwood/include/mach/cpu.h  |  13 +--
 arch/arm/mach-omap2/omap3/lowlevel_init.S  |  36 +++----
 arch/arm/mach-renesas/lowlevel_init_gen3.S |   6 +-
 arch/arm/mach-tegra/psci.S                 |  12 +--
 arch/riscv/lib/memcpy.S                    |   6 +-
 arch/riscv/lib/memmove.S                   |   7 +-
 arch/riscv/lib/memset.S                    |   6 +-
 arch/riscv/lib/semihosting.S               |   2 -
 arch/riscv/lib/setjmp.S                    |   6 --
 common/Makefile                            |   4 -
 include/linux/linkage.h                    |  20 +++-
 45 files changed, 387 insertions(+), 390 deletions(-)
 create mode 100644 arch/arm/cpu/arm926ejs/cp15.S
 create mode 100644 arch/arm/lib/cp15.S
 create mode 100644 arch/arm/mach-kirkwood/cp15.S

--
2.43.0

base-commit: 7027b445cc0bfb86204ecb1f1fe596f5895048d9
branch: lto-fixes