https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84790
--- Comment #13 from Matthias Schiffer <mschif...@universe-factory.net> ---
I don't think the register used matters: changing it may hide the bug in specific instances, but it does not fix the root cause.

I've now built a simpler reproducer which still seems to exhibit the same issue with your latest patch. However, I've only built a bare-metal GCC with your patch and looked at the generated code; I have not actually run this example on the affected platforms, so I might be overlooking something. I will try to get a full toolchain build done in the next few days.

The basic premise of the following code: in test(), the return value `ret` must be moved from v0 to a different register temporarily for the call to bar(). Using the inline asm, GCC is nudged to use v1 as this temporary register. As GCC knows the contents of foo() and bar(), it assumes that the value of v1 is preserved across the call to bar(). This assumption is wrong because the gp setup code is inserted at the beginning of bar() after all optimization and register allocation have already happened. As mentioned before, this setup code clobbers v1.
```
unsigned ext(void);

__attribute__((noinline))
static void foo(void)
{
	/* Do not let the optimizer remove foo and bar */
	asm volatile("");
}

__attribute__((noinline))
static void bar(void)
{
	foo();
}

unsigned test(void)
{
	unsigned ret = ext();
	register unsigned v1 asm("v1") = ret;

	asm volatile("" :: "r"(v1));
	bar();

	return ret;
}
```

`objdump -d -r` output (built using GCC commit 05daf617ea22e1d818295ed2d037456937e23530, with "-Os -mips32r2 -mtune=24kc -mabicalls -mips16 -fpic"):

```
Disassembly of section .text:

00000000 <foo>:
   0:   e8a0            jrc     ra
   2:   6500            nop

00000004 <bar>:
   4:   f000 6a00       li      v0,0
                        4: R_MIPS16_HI16        _gp_disp
   8:   f000 0b00       la      v1,8 <bar+0x4>
                        8: R_MIPS16_LO16        _gp_disp
   c:   f400 3240       sll     v0,16
  10:   e269            addu    v0,v1
  12:   64c4            save    32,ra
  14:   659a            move    gp,v0
  16:   d204            sw      v0,16(sp)
  18:   675c            move    v0,gp
  1a:   f000 9a40       lw      v0,0(v0)
                        1a: R_MIPS16_GOT16      foo
  1e:   f000 4a00       addiu   v0,0
                        1e: R_MIPS16_LO16       foo
  22:   ea40            jalr    v0
  24:   653a            move    t9,v0
  26:   6444            restore 32,ra
  28:   e8a0            jrc     ra
  2a:   6500            nop

0000002c <test>:
  2c:   f000 6a00       li      v0,0
                        2c: R_MIPS16_HI16       _gp_disp
  30:   f000 0b00       la      v1,30 <test+0x4>
                        30: R_MIPS16_LO16       _gp_disp
  34:   f400 3240       sll     v0,16
  38:   e269            addu    v0,v1
  3a:   659a            move    gp,v0
  3c:   64e4            save    32,ra,s0
  3e:   671c            move    s0,gp
  40:   d204            sw      v0,16(sp)
  42:   f000 9840       lw      v0,0(s0)
                        42: R_MIPS16_CALL16     ext
  46:   ea40            jalr    v0
  48:   653a            move    t9,v0
  4a:   6762            move    v1,v0
  4c:   f000 9800       lw      s0,0(s0)
                        4c: R_MIPS16_GOT16      bar
  50:   f000 4800       addiu   s0,0
                        50: R_MIPS16_LO16       bar
  54:   e840            jalr    s0
  56:   6538            move    t9,s0
  58:   6464            restore 32,ra,s0
  5a:   e820            jr      ra
  5c:   6743            move    v0,v1
  5e:   6500            nop
```

At 4a, the return value is moved to v1. At 5c (the delay slot of the `jr ra`), it is supposed to be moved back to v0, but v1 has been clobbered in the meantime by the `la v1` at 8 in the gp setup code of bar().
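
For readers without a MIPS16 toolchain at hand, the sequence can be replayed with a toy model in plain C. This is only a sketch of the data flow in the listing, not of what GCC does internally: the struct `regs`, the helpers `gp_setup`, `model_bar` and `model_test`, and the constants passed to `gp_setup` are all made up for illustration (the real values come from the `_gp_disp` relocations).

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: only the two registers that matter here. */
struct regs {
    uint32_t v0;
    uint32_t v1;
};

/* Models the gp setup at the start of bar() (offsets 4..10 in the
 * listing). hi/lo are placeholder values, not real relocations; the
 * point is only that both v0 and v1 get overwritten. */
static void gp_setup(struct regs *r, uint32_t hi, uint32_t lo)
{
    r->v0 = hi;       /*  4: li   v0,<hi part of _gp_disp> */
    r->v1 = lo;       /*  8: la   v1,<lo part of _gp_disp> */
    r->v0 <<= 16;     /*  c: sll  v0,16                    */
    r->v0 += r->v1;   /* 10: addu v0,v1                    */
}

/* Models bar(): the call to foo() is omitted because foo() touches
 * neither v0 nor v1 in the listing. */
static void model_bar(struct regs *r)
{
    gp_setup(r, 0x0001u, 0x8000u); /* hypothetical constants */
}

/* Models test() from the point where ext() has returned. */
uint32_t model_test(uint32_t ext_result)
{
    struct regs r = { 0u, 0u };

    r.v0 = ext_result;            /* ext() returns its value in v0 */
    r.v1 = r.v0;                  /* 4a: move v1,v0                */
    assert(r.v1 == ext_result);   /* the copy is intact before the call */

    model_bar(&r);                /* 54: jalr s0 (call to bar)     */

    r.v0 = r.v1;                  /* 5c: move v0,v1                */
    return r.v0;
}
```

With these placeholder constants, `model_test(42)` returns 0x8000 instead of 42: whatever ext() returned is lost, and the result depends only on what the gp setup left in v1.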