2014-06-20 20:14 GMT+02:00 Jeff Law <l...@redhat.com>:
> On 06/20/14 12:07, Kai Tietz wrote:
>>
>> 2014-06-20 19:55 GMT+02:00 Richard Henderson <r...@redhat.com>:
>>>
>>> On 06/20/2014 10:52 AM, Kai Tietz wrote:
>>>>
>>>> 2014-06-20  Kai Tietz  <kti...@redhat.com>
>>>>
>>>>      PR target/39284
>>>>      * passes.def (peephole2): Add second peephole2 pass before
>>>>      split before sched2 pass.
>>>>      * config/i386/i386.md (peephole2): To combine
>>>>      indirect jump with memory.
>>>>      (split2): Likewise.
>>>
>>>
>>> Why are we adding a second pass instead of just moving the one?
>>>
>>>
>>> r~
>>
>>
>> I explained that to Jeff in a prior mail in this thread. IIRC some
>> conversions of impossible pushes are then done too late, and so are
>> some patterns involving splits & dying registers.  That means we
>> produce weaker code.
>
> So let's dig into this deeper.  Examples & explanations would help.  I know
> it feels like a bit of a runaround, but avoiding adding the pass would be
> good.
>
> jeff

I dug into it a bit and couldn't find any significant difference for
the x64 target with the existing testcases.
I am still a bit concerned, although I can't reproduce it for x86/x86_64
targets, that we might cause regressions for other targets by moving the
peephole2 pass too close to the sched2 pass.  Therefore I searched
for the place closest to the pass's previous position that still
enables the indirect jump optimization on memory.
Testing on x86/x64 shows that the pass needs to run directly after the
"reorder blocks" pass.
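
For illustration only, here is a hypothetical example (not the PR's
testcase; all names are invented) of code where an indirect jump can
take its target from memory.  The call in dispatch is a sibling call
through a function pointer loaded from a table, so a compiler may
either load the pointer into a register and jump through it, or jump
through the memory operand directly.  The second form is what the
peephole is meant to enable.

```c
/* Hypothetical example, not taken from PR target/39284.  */
typedef int (*handler_fn) (int);

static int add_one (int x) { return x + 1; }
static int twice (int x) { return x * 2; }

/* Non-const table, so the target has to be loaded from memory
   at run time.  */
handler_fn handlers[2] = { add_one, twice };

/* Sibling call through a pointer that lives in memory.  */
int
dispatch (unsigned int idx, int x)
{
  return handlers[idx & 1] (x);
}
```

Compiling something like this with optimization enabled and comparing
the assembly for dispatch before and after the passes.def change is one
way to check whether the memory operand ends up in the jump.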

So I suggest the following change to passes.def:

Index: passes.def
===================================================================
--- passes.def  (revision 211850)
+++ passes.def  (working copy)
@@ -384,7 +384,6 @@ along with GCC; see the file COPYING3.  If not see
          NEXT_PASS (pass_rtl_dse2);
          NEXT_PASS (pass_stack_adjustments);
          NEXT_PASS (pass_jump2);
-         NEXT_PASS (pass_peephole2);
          NEXT_PASS (pass_if_after_reload);
          NEXT_PASS (pass_regrename);
          NEXT_PASS (pass_cprop_hardreg);
@@ -391,6 +390,7 @@ along with GCC; see the file COPYING3.  If not see
          NEXT_PASS (pass_fast_rtl_dce);
          NEXT_PASS (pass_duplicate_computed_gotos);
          NEXT_PASS (pass_reorder_blocks);
+         NEXT_PASS (pass_peephole2);
          NEXT_PASS (pass_branch_target_load_optimize2);
          NEXT_PASS (pass_leaf_regs);
          NEXT_PASS (pass_split_before_sched2);

Kai
