On Sep 23, 2015, Alexandre Oliva <aol...@redhat.com> wrote: > @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all > *all, [snip] > + if (GET_CODE (reg) != CONCAT) > + stack_parm = reg; > + else > + /* This will use or allocate a stack slot that we'd rather > + avoid. FIXME: Could we avoid it in more cases? */ > + target_reg = reg;
It turns out that we can, and that helps fixing PR67753. In the end, I ended up using the ABI-reserved stack slot if there is one, but just allocating an unsplit complex pseudo fixes all remaining cases that used to require the allocation of a stack slot. Yay! As for pr67753 proper, we emitted the store of the PARALLEL entry_parm into the stack parm only in the conversion seq, which was ultimately emitted after the copy from stack_parm to target_reg that was supposed to copy the value originally in entry_parm. So we copied an uninitialized stack slot, and the subsequent store in the conversion seq was optimized out as dead. This caused a number of regressions on hppa-linux-gnu. The fix for this is to arrange for the copy to target_reg to be emitted in the conversion seq if the copy to stack_parm was. I can't determine whether this fix all reported regressions, but from visual inspection of the generated code I'm pretty sure it fixes at least gcc.c-torture/execute/pr38969.c. When we do NOT have an ABI-reserved stack slot, the store of the PARALLEL entry_parm into the intermediate pseudo doesn't need to go in the conversion seq (emit_group_store from a PARALLEL to a pseudo only uses registers, according to another comment in function.c), so I've simplified that case. This was regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu, ppc64el-linux-gnu, and cross-build-tested for all targets for which I've tested the earlier patches in the patchset. Ok to install? [PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg From: Alexandre Oliva <aol...@redhat.com> In assign_parms_setup_block, the copy of args in PARALLELs from entry_parm to stack_parm is deferred to the parm conversion insn seq, but the copy from stack_parm to target_reg was inserted in the normal copy seq, that is executed before the conversion insn seq. Oops. We could do away with the need for an actual stack_parm in general, which would have avoided the need for emitting the copy to target_reg in the conversion seq, but at least on pa, due to the need for stack to copy between SI and SF modes, it seems like using the reserved stack slot is beneficial, so I put in logic to use a pre-reserved stack slot when there is one, and emit the copy to target_reg in the conversion seq if stack_parm was set up there. for gcc/ChangeLog PR rtl-optimization/67753 PR rtl-optimization/64164 * function.c (assign_parm_setup_block): Avoid allocating a stack slot if we don't have an ABI-reserved one. Emit the copy to target_reg in the conversion seq if the copy from entry_parm is in it too. Don't use the conversion seq to copy a PARALLEL to a REG or a CONCAT. --- gcc/function.c | 39 ++++++++++++++++++++++++++++++++++----- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/gcc/function.c b/gcc/function.c index aaf49a4..156c72b 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -2879,6 +2879,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all, rtx entry_parm = data->entry_parm; rtx stack_parm = data->stack_parm; rtx target_reg = NULL_RTX; + bool in_conversion_seq = false; HOST_WIDE_INT size; HOST_WIDE_INT size_stored; @@ -2895,9 +2896,23 @@ assign_parm_setup_block (struct assign_parm_data_all *all, if (GET_CODE (reg) != CONCAT) stack_parm = reg; else - /* This will use or allocate a stack slot that we'd rather - avoid. FIXME: Could we avoid it in more cases? */ - target_reg = reg; + { + target_reg = reg; + /* Avoid allocating a stack slot, if there isn't one + preallocated by the ABI. It might seem like we should + always prefer a pseudo, but converting between + floating-point and integer modes goes through the stack + on various machines, so it's better to use the reserved + stack slot than to risk wasting it and allocating more + for the conversion. */ + if (stack_parm == NULL_RTX) + { + int save = generating_concat_p; + generating_concat_p = 0; + stack_parm = gen_reg_rtx (mode); + generating_concat_p = save; + } + } data->stack_parm = NULL; } @@ -2938,7 +2953,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all, mem = validize_mem (copy_rtx (stack_parm)); /* Handle values in multiple non-contiguous locations. */ - if (GET_CODE (entry_parm) == PARALLEL) + if (GET_CODE (entry_parm) == PARALLEL && !MEM_P (mem)) + emit_group_store (mem, entry_parm, data->passed_type, size); + else if (GET_CODE (entry_parm) == PARALLEL) { push_to_sequence2 (all->first_conversion_insn, all->last_conversion_insn); @@ -2946,6 +2963,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all, all->first_conversion_insn = get_insns (); all->last_conversion_insn = get_last_insn (); end_sequence (); + in_conversion_seq = true; } else if (size == 0) @@ -3025,11 +3043,22 @@ assign_parm_setup_block (struct assign_parm_data_all *all, all->first_conversion_insn = get_insns (); all->last_conversion_insn = get_last_insn (); end_sequence (); + in_conversion_seq = true; } if (target_reg) { - emit_move_insn (target_reg, stack_parm); + if (!in_conversion_seq) + emit_move_insn (target_reg, stack_parm); + else + { + push_to_sequence2 (all->first_conversion_insn, + all->last_conversion_insn); + emit_move_insn (target_reg, stack_parm); + all->first_conversion_insn = get_insns (); + all->last_conversion_insn = get_last_insn (); + end_sequence (); + } stack_parm = target_reg; } -- Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer