On 18/05/2021 07.50, Stefan Roese wrote: > Hi Matt, > > On 17.05.21 19:32, Matt Merhar wrote: >> The assembly output of the arch_initr_trap() function differed by a >> single byte after common.h was removed from traps.c: >> >> fff49a18 <arch_initr_trap>: >> fff49a18: 94 21 ff f0 stwu r1,-16(r1) >> fff49a1c: 7c 08 02 a6 mflr r0 >> fff49a20: 90 01 00 14 stw r0,20(r1) >> -fff49a24: 80 62 00 44 lwz r3,68(r2) >> +fff49a24: 80 62 00 38 lwz r3,56(r2) >> fff49a28: 4b ff 76 19 bl fff41040 <trap_init> >> fff49a2c: 80 01 00 14 lwz r0,20(r1) >> fff49a30: 38 60 00 00 li r3,0 >> fff49a34: 38 21 00 10 addi r1,r1,16 >> fff49a38: 7c 08 03 a6 mtlr r0 >> >> This was causing a consistent hard lockup during the MMC read / loading >> of the QoriQ FMan firmware on a P2041RDB board. >> >> Re-adding the header causes identical assembly to be emitted and allows >> the firmware loading and subsequent boot to succeed. >> >> Fixes: 401d1c4f5d ("common: Drop asm/global_data.h from common header") >> Signed-off-by: Matt Merhar <mattmer...@protonmail.com> > > Did you try to investigate what exactly causes this difference in the > assembly code, when the header is not included? Re-adding this header > seems like "papering over" the real problem to me.
Running pahole on traps.o without and with that patch shows (I took P2041RDB_defconfig) struct global_data { struct bd_info * bd; /* 0 4 */ long unsigned int flags; /* 4 4 */ unsigned int baudrate; /* 8 4 */ long unsigned int cpu_clk; /* 12 4 */ long unsigned int bus_clk; /* 16 4 */ long unsigned int pci_clk; /* 20 4 */ long unsigned int mem_clk; /* 24 4 */ long unsigned int have_console; /* 28 4 */ long unsigned int env_addr; /* 32 4 */ long unsigned int env_valid; /* 36 4 */ long unsigned int env_has_init; /* 40 4 */ int env_load_prio; /* 44 4 */ long unsigned int ram_base; /* 48 4 */ /* XXX 4 bytes hole, try to pack */ phys_addr_t ram_top; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ long unsigned int relocaddr; /* 64 4 */ vs struct global_data { struct bd_info * bd; /* 0 4 */ long unsigned int flags; /* 4 4 */ unsigned int baudrate; /* 8 4 */ long unsigned int cpu_clk; /* 12 4 */ long unsigned int bus_clk; /* 16 4 */ long unsigned int pci_clk; /* 20 4 */ long unsigned int mem_clk; /* 24 4 */ long unsigned int post_log_word; /* 28 4 */ long unsigned int post_log_res; /* 32 4 */ long unsigned int post_init_f_time; /* 36 4 */ long unsigned int have_console; /* 40 4 */ long unsigned int env_addr; /* 44 4 */ long unsigned int env_valid; /* 48 4 */ long unsigned int env_has_init; /* 52 4 */ int env_load_prio; /* 56 4 */ long unsigned int ram_base; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ phys_addr_t ram_top; /* 64 8 */ long unsigned int relocaddr; /* 72 4 */ so it seems to be the #if defined(CONFIG_POST) in include/asm-generic/global_data.h which is broken or unreliable or whatever the appropriate way to describe it is. It would be nice if the linker could catch that kind of thing when collecting debug info from various TUs. Rasmus