https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902
--- Comment #7 from Ilya Leoshkevich <iii at linux dot ibm.com> ---

Apparently, for this specific case doing more of hard register copy
propagation is enough.  I've just tried running pass_cprop_hardreg before
pass_thread_prologue_and_epilogue, and it helped.

So, would running a mini-cprop_hardreg instead of just
copyprop_hardreg_forward_bb_without_debug_insn (entry_block) be reasonable
here?  Something along the lines of:

- Do something like pre_and_rev_post_order_compute_fn (), but do not go
  further from bbs which contain insns satisfying requires_stack_frame_p (),
  since shrink-wrapping cannot happen past those anyway.  Same for bbs which
  have more than 1 predecessor, since cprop_hardreg forgets everything it saw
  when it encounters those.  Not sure if a reasonable merge function can be
  defined for struct value_data to improve this?  Maybe also stop completely
  when a certain number of bbs is found.

- Do something like pass_cprop_hardreg::execute (), but use only bbs computed
  during the previous step.

Btw, would reverse postorder be the "more intelligent queuing of blocks"
mentioned in pass_cprop_hardreg::execute ()?

When you say that what IRA does is not effective, do you mean just the need to
track indirect hard register copies, or can it be improved even further?
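
To make the first step of the proposal more concrete, here is a rough
standalone sketch of the bounded traversal on a toy CFG.  It is not GCC code:
Block, needs_frame, num_preds and candidate_blocks are hypothetical names, and
needs_frame merely stands in for "contains an insn satisfying
requires_stack_frame_p ()".  It also assumes that a block at which the walk
stops is still included in the result, and only its successors are skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a basic block (hypothetical, not GCC's basic_block).
struct Block {
  std::vector<int> succs;    // indices of successor blocks
  int num_preds = 0;         // number of incoming edges
  bool needs_frame = false;  // stand-in for requires_stack_frame_p () insns
};

// Depth-first walk that records blocks in postorder.  It does not follow
// successors of blocks that need a frame or have more than one predecessor,
// and it stops discovering new blocks once max_blocks have been found.
static void walk(const std::vector<Block>& cfg, int bb, std::size_t max_blocks,
                 std::vector<bool>& seen, std::size_t& found,
                 std::vector<int>& post) {
  seen[bb] = true;
  ++found;
  const Block& b = cfg[bb];
  const bool stop_here = b.needs_frame || b.num_preds > 1;
  if (!stop_here) {
    for (int s : b.succs) {
      if (found >= max_blocks)
        break;  // block budget exhausted
      if (!seen[s])
        walk(cfg, s, max_blocks, seen, found, post);
    }
  }
  post.push_back(bb);
}

// Returns the selected blocks in reverse postorder, starting at entry.
std::vector<int> candidate_blocks(const std::vector<Block>& cfg, int entry,
                                  std::size_t max_blocks) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < cfg.size());
  std::vector<int> post;
  if (max_blocks == 0)
    return post;
  std::vector<bool> seen(cfg.size(), false);
  std::size_t found = 0;
  walk(cfg, entry, max_blocks, seen, found, post);
  std::reverse(post.begin(), post.end());
  return post;
}
```

The second step would then run the per-block forward propagation only over
the returned list, in that order.  For a diamond-shaped CFG (0 -> 1, 0 -> 2,
1 -> 3, 2 -> 3, 3 -> 4) the walk includes the join block 3 but never reaches
block 4.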