Quoting Richard Sandiford <rdsandif...@googlemail.com>:
Joern Rennecke <joern.renne...@embecosm.com> writes:
Quoting Richard Sandiford <rdsandif...@googlemail.com>:
The fact that we even have shared unique ids is pretty bad -- and surely
a contradiction in terms -- but I think both ways of handling them rely
on the length being the same for all copies. If we don't record a length
(your version), we won't set something_changed if the length of this copy
did in fact change, which could lead to branches remaining too short.
That is not the case, as the current length of the inner insn is added to
new_length (of the outer insn).
If we do record a length (current version),
In the current version, the length of the inner insn is calculated anew
each iteration, so only the out-of-sequence copy suffers from the bogosity.
we could end up changing
the length of previous copies that have already been processed in
this iteration.
Both that, and also using the length or the previous insn for the inner
length calculation, and ratcheting them up together.
In fact, this can make delay slot instructions ineligible for the delay slot
they are in. Also, the unwanted length cross-interference can violate
alignment invariants, and this leads to invalid code on ARCompact.
But if you're saying that ARCompat wants different copies of the same insn
(with the same uid) to have different lengths, then please fix dbr_schedule
so that it uses different uids. The "i == 0" thing is just too hackish IMO.
Implemented with the attached patch.
I've been wondering if I should use copy_insn, but that doesn't seem to make
any real difference after reload.
Also, copying only PATTERN, INSN_LOCATION, and REG_NOTES into the new
insn obtained from make_insn_raw had seemed a possibility, but there is no
firm assurance that we will only see insn with the code INSN.
In the end, continuing to use copy_rtx to make the copy seems least likely to
break something. And now that the copying is in one place, it's easier to
change the way it is done if you want to try that.
We waste a bit of memory (temporarily, as it is garbage collected) by using
make_insn_raw just to increment cur_insn_uid. Alternatives would be to
move the function inside emit-rtl.c, or have emit-rtl.h provide a means to
increment cur_insn_uid. Well, or we could directly access
crtl->emit.x_cur_insn_uid, but that would certainly be hackish.
regression tested on i686-pc-linux-gnu X cris-elf
(cris-elf has delayed branches, CASE_VECTOR_SHORTEN_MODE, is not
currently affected by PR54938, and builds libgfortran.)
C / C++ / objc regression tests for mipsel-elf also passed.
2012-10-19 Joern Rennecke <joern.renne...@embecosm.com>
* reorg.c (copy_delay_slot_insn): New function.
(steal_delay_list_from_target, fill_slots_from_thread): Use it.
(fill_simple_delay_slots): Likewise.
Index: reorg.c
===================================================================
--- reorg.c (revision 192573)
+++ reorg.c (working copy)
@@ -1179,6 +1179,19 @@ check_annul_list_true_false (int annul_t
return 1;
}
+/* Copy INSN, which is to remain in place (in fill_slots_from_thread parlance,
+ it is from a thread we don't own), for use in a delay slot. */
+static rtx
+copy_delay_slot_insn (rtx insn)
+{
+ /* Copy INSN with its rtx_code, all its notes, location etc. */
+ insn = copy_rtx (insn);
+ /* Get new UID. */
+ rtx new_insn = make_insn_raw (PATTERN (insn));
+ INSN_UID (insn) = INSN_UID (new_insn);
+ return insn;
+}
+
/* INSN branches to an insn whose pattern SEQ is a SEQUENCE. Given that
the condition tested by INSN is CONDITION and the resources shown in
OTHER_NEEDED are needed after INSN, see whether INSN can take all the insns
@@ -1297,7 +1310,7 @@ steal_delay_list_from_target (rtx insn,
{
if (must_annul)
used_annul = 1;
- temp = copy_rtx (trial);
+ temp = copy_delay_slot_insn (trial);
INSN_FROM_TARGET_P (temp) = 1;
new_delay_list = add_to_delay_list (temp, new_delay_list);
total_slots_filled++;
@@ -2369,7 +2382,8 @@ fill_simple_delay_slots (int non_jumps_p
if (new_label)
{
delay_list
- = add_to_delay_list (copy_rtx (next_trial), delay_list);
+ = add_to_delay_list (copy_delay_slot_insn (next_trial),
+ delay_list);
slots_filled++;
reorg_redirect_jump (trial, new_label);
@@ -2793,7 +2807,7 @@ fill_slots_from_thread (rtx insn, rtx co
else
new_thread = next_active_insn (trial);
- temp = own_thread ? trial : copy_rtx (trial);
+ temp = own_thread ? trial : copy_delay_slot_insn (trial);
if (thread_if_true)
INSN_FROM_TARGET_P (temp) = 1;
@@ -2974,7 +2988,7 @@ fill_slots_from_thread (rtx insn, rtx co
else
new_thread = next_active_insn (trial);
- ninsn = own_thread ? trial : copy_rtx (trial);
+ ninsn = own_thread ? trial : copy_delay_slot_insn (trial);
if (thread_if_true)
INSN_FROM_TARGET_P (ninsn) = 1;