Chung-Ju Wu <jasonw...@gmail.com> writes:
> On 10/2/13 1:31 AM, Richard Sandiford wrote:
>> Chung-Ju Wu <jasonw...@gmail.com> writes:
>>> + /* Use $r15, if the value is NOT in the range of Is20,
>>> + we must output "sethi + ori" directly since
>>> + we may already passed the split stage. */
>>> + return "sethi\t%0, hi20(%1)\;ori\t%0, %0, lo12(%1)";
>>> + case 17:
>>> + return "#";
>>
>> I don't really understand the comment for case 16. Returning "#"
>> (like for case 17) forces a split even at the output stage.
>>
>> In this case it might not be worth forcing a split though, so I don't
>> see any need to change the code. I think the comment should be changed
>> to give a different reason though.
>>
>
> Sorry for the misleading comment.
>
> For case 17, we were trying to split large constant into two individual
> rtx patterns into "sethi" + "addi" so that we can have chance to match
> "addi" pattern with 16-bit instruction.
>
> But case 16 is different.
> This case is only produced at prologue/epilogue phase, using a temporary
> register $r15 to hold a large constant for adjusting stack pointer.
> Since prologue/epilogue is after split1/split2 phase, we can only
> output "sethi" + "ori" directly.
> (The "addi" instruction with $r15 is a 32-bit instruction.)

But this code is in the output template of the define_insn. That code
is only executed during final, after all passes have been run. If the
template returns "#", final will split the instruction itself, which is
possible even at that late stage. "#" doesn't have any effect on the
passes themselves.

(FWIW, there's also a split3 pass that runs after prologue/epilogue
generation but before sched2.)

However, ISTR there is/was a rule that prologue instructions shouldn't
be split, since they'd lose their RTX_FRAME_RELATED_P bit or something.
Maybe you hit an ICE because of that?

Another way to handle this would be to have the movsi expander split
large constant moves. When can_create_pseudo_p (), the intermediate
results can be stored in new registers, otherwise they should reuse
operands[0]. Two advantages to doing it that way are that high parts
can be shared before RA, and that calls to emit_move_insn from the
prologue code will split the move automatically. I think many ports
do it that way (including MIPS FWIW).

Thanks,
Richard
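
To make the movsi suggestion above a little more concrete, here is a
small standalone C model of the split. It is not GCC code and not the
port's implementation. It assumes that hi20() and lo12() select the
upper 20 and lower 12 bits of a 32-bit constant, that sethi writes its
immediate into bits 31..12 and clears the low 12 bits, and that ori ORs
in a zero-extended immediate. The helper names (model_sethi, model_ori,
pick_intermediate_reg, split_const_move) and the pseudo numbering are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed meaning of the hi20()/lo12() operand modifiers.  */
static uint32_t hi20 (uint32_t value) { return value >> 12; }
static uint32_t lo12 (uint32_t value) { return value & 0xfffu; }

/* Assumed sethi semantics: the 20-bit immediate lands in bits 31..12
   and the low 12 bits are cleared.  */
static uint32_t model_sethi (uint32_t imm20)
{
  assert (imm20 <= 0xfffffu);
  return imm20 << 12;
}

/* Assumed ori semantics: OR a zero-extended immediate into the source.  */
static uint32_t model_ori (uint32_t src, uint32_t imm)
{
  return src | imm;
}

/* Pick the register that holds the intermediate (high-part) result.
   While new pseudos may still be created, take a fresh one so that equal
   high parts can be shared later; otherwise reuse the destination, which
   is what a move emitted from the prologue would have to do.  */
static int pick_intermediate_reg (int dest_reg, int can_create_pseudo,
                                  int *next_pseudo)
{
  return can_create_pseudo ? (*next_pseudo)++ : dest_reg;
}

/* Value left in the destination after the two-instruction sequence.  */
static uint32_t split_const_move (uint32_t value)
{
  uint32_t high = model_sethi (hi20 (value));
  return model_ori (high, lo12 (value));
}
```

For example, split_const_move (0x12345678) rebuilds the constant from
hi20 = 0x12345 and lo12 = 0x678. With can_create_pseudo set to 0 and a
destination of 15 (standing in for $r15), pick_intermediate_reg returns
15, so the high part is built in the destination itself.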