Ulrich Weigand wrote: > Georg-Johann Lay wrote: >> Ulrich Weigand wrote: >>> This means that problems like the one you're seeing have been hidden >>> so far. I've started looking into fixing this, but since I don't >>> have a target where this is needed, this effort never really went >>> anywhere. I'll append below a patch I did a year or so ago; just >>> as something to look at, it probably will not even apply to the >>> current sources any more. >> Sounds great that you tried to fix it! Much work below... wow. >> >>> I'd be happy to bring this up to date if you're willing to work with >>> me to get this tested on a target that needs this support ... > > Here's an updated version of the patch that build with current GCC > mainline and passes the SPU test suite with no regressions. > > Let me know if this works for you ...
Thanks, it works. However, it results in bloated code. I guess it's an IRA issue. int read_from_pgm_3 (const int __pgm * const __pgm * const addr) { return **addr; } results in bloated (-Os -mmcu=atmega8) Z = r30/r31 Y = r28/r29 is frame pointer Even if the dead store insn 30 is eliminated a frame is generated. read_from_pgm_3: push r28 ; 31 pushqi1/1 [length = 1] push r29 ; 32 pushqi1/1 [length = 1] rcall . ; 36 *addhi3_sp_R_pc2 [length = 1] in r28,__SP_L__ ; 37 *movhi_sp/2 [length = 2] in r29,__SP_H__ /* prologue: function */ /* frame size = 2 */ /* stack size = 4 */ .L__stack_usage = 4 movw r30,r24 ; 28 *movphi/1 [length = 2] lpm __tmp_reg__,Z+ ; 19 *movphi/2 [length = 6] lpm r31,Z mov r30,__tmp_reg__ std Y+2,r31 ; 30 *movphi/3 [length = 7] std Y+1,r30 lpm r24,Z+ ; 23 *movhi/2 [length = 2] lpm r25,Z+ /* epilogue start */ pop __tmp_reg__ ; 42 *addhi3_sp_R_pc2 [length = 2] pop __tmp_reg__ pop r29 ; 43 popqi [length = 1] pop r28 ; 44 popqi [length = 1] ret ; 45 return_from_epilogue [length = 1] The code could be just read_from_pgm_3: /* prologue: function */ ;; Z = r24 movw r30,r24 ; 28 *movphi/1 [length = 2] ;; Z = *Z lpm __tmp_reg__,Z+ ; 19 *movphi/2 [length = 6] lpm r31,Z mov r30,__tmp_reg__ ;; r24 = *Z++ lpm r24,Z+ ; 23 *movhi/2 [length = 2] lpm r25,Z+ /* epilogue start */ ret ; 45 return_from_epilogue [length = 1] MODE_CODE_BASE_REG_CLASS no reads reg_class_t avr_mode_code_base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED, addr_space_t address_space, RTX_CODE outer_code ATTRIBUTE_UNUSED, RTX_CODE index_code ATTRIBUTE_UNUSED) { if (ADDR_SPACE_PGM == address_space) { return POINTER_Z_REGS; } return reload_completed ? BASE_POINTER_REGS : POINTER_REGS; } even though for __pgm no base+offset addressing is available. Returning NO_REGS for __pgm results in ICE in push_relaod. I frequently see IRA doing a very bad job for small register classes like here. Maybe it's better to take it completely away from IRA and write the load as (set (reg:HI) (unspec:HI (post_inc:PHI (reg:PHI Z)))) Loading from __pgm is actually an unspec, i.e. reading two times from the same address will yield the same result. IRA gets fancy about spilling: Spilling for insn 19. Using reg 30 for reload 0 Try Assign 47(a0), cost=48000 changing reg in insn 19 changing reg in insn 25 changing reg in insn 22 Assigning 47(freq=3000) a new slot 0 Register 47 now on stack. Spilling for insn 19. Using reg 30 for reload 1 Using reg 30 for reload 0 Using reg 18 for reload 3 Spilling for insn 22. Using reg 26 for reload 0 Spilling for insn 25. Using reg 26 for reload 0 Spilling for insn 19. Using reg 30 for reload 0 Using reg 18 for reload 1 Spilling for insn 25. Spilling for insn 19. Using reg 30 for reload 0 Using reg 18 for reload 1 Spilling for insn 25. Johann > Bye, > Ulrich > > > ChangeLog: > > * doc/tm.texi.in (MODE_CODE_BASE_REG_CLASS): Add address space > argument. > (REGNO_MODE_CODE_OK_FOR_BASE_P): Likewise. > * doc/tm.texi: Regenerate. > > * config/cris/cris.h (MODE_CODE_BASE_REG_CLASS): Add address > space argument. > (REGNO_MODE_CODE_OK_FOR_BASE_P): Likewise. > * config/bfin/bfin.h (MODE_CODE_BASE_REG_CLASS): Likewise. > (REGNO_MODE_CODE_OK_FOR_BASE_P): Likewise. > > * addresses.h (base_reg_class): Add address space argument. > Pass to MODE_CODE_BASE_REG_CLASS. > (ok_for_base_p_1): Add address space argument. Pass to > REGNO_MODE_CODE_OK_FOR_BASE_P. > (regno_ok_for_base_p): Add address space argument. Pass to > ok_for_base_p_1. > > * regrename.c (scan_rtx_address): Add address space argument. > Pass address space to regno_ok_for_base_p and base_reg_class. > Update recursive calls. > (scan_rtx): Pass address space to scan_rtx_address. > (build_def_use): Likewise. > * regcprop.c (replace_oldest_value_addr): Add address space > argument. Pass to regno_ok_for_base_p and base_reg_class. > Update recursive calls. > (replace_oldest_value_mem): Pass address space to > replace_oldest_value_addr. > (copyprop_hardreg_forward_1): Likewise. > > * reload.c (find_reloads_address_1): Add address space argument. > Pass address space to base_reg_class and regno_ok_for_base_p. > Update recursive calls. > (find_reloads_address): Pass address space to base_reg_class, > regno_ok_for_base_p, and find_reloads_address_1. > (find_reloads): Pass address space to base_reg_class. > (find_reloads_subreg_address): Likewise. > > * ira-costs.c (record_reg_classes): Update calls to base_reg_class. > (ok_for_base_p_nonstrict): Add address space argument. Pass to > ok_for_base_p_1. > (record_address_regs): Add address space argument. Pass to > base_reg_class and ok_for_base_p_nonstrict. Update recursive calls. > (record_operand_costs): Pass address space to record_address_regs. > (scan_one_insn): Likewise. > > * caller-save.c (init_caller_save): Update call to base_reg_class. > * ira-conflicts.c (ira_build_conflicts): Likewise. > * reload1.c (maybe_fix_stack_asms): Likewise.