Jeff Law wrote:
I was looking at a regression caused by having ira-reload utilize the
existing copy detection code in IRA rather than my own and stumbled
upon this...
Consider this insn prior to IRA:
(insn 72 56 126 8 j.c:744 (parallel [
            (set (reg:SI 110)
                (minus:SI (reg:SI 69 [ ew_u$parts$lsw ])
                    (reg:SI 68 [ ew_u$parts$lsw ])))
            (clobber (reg:CC 17 flags))
        ]) 290 {*subsi_1} (expr_list:REG_DEAD (reg:SI 69 [ ew_u$parts$lsw ])
        (expr_list:REG_DEAD (reg:SI 68 [ ew_u$parts$lsw ])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
Which matches this pattern in the x86 backend:
(define_insn "*sub<mode>_1"
  [(set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m,<r>")
        (minus:SWI
          (match_operand:SWI 1 "nonimmediate_operand" "0,0")
          (match_operand:SWI 2 "<general_operand>" "<r><i>,<r>m")))
   (clobber (reg:CC FLAGS_REG))]
  "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
  "sub{<imodesuffix>}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "<MODE>")])
Note carefully that the constraints require operands 0 and 1 to
match. Operand 2 is not tied to any other operand. In fact, if
operand 2 is tied to operand 0, then we are guaranteed to generate a
reload and muck up the code pretty badly.
Now looking at the copies recorded by IRA we have:
cp0:a0(r95)<->a1(r70)@248:move
cp1:a4(r69)<->a6(r110)@178:constraint
cp2:a3(r68)<->a6(r110)@22:shuffle
cp3:a9(r66)<->a10(r92)@11:shuffle
cp4:a11(r79)<->a12(r93)@89:move
cp5:a1(r70)<->a14(r96)@114:constraint
cp6:a1(r70)<->a13(r97)@114:constraint
Note carefully cp2 which claims a shuffle-copy between r68 and r110.
ISTM that when trying to assign a hard reg to pseudo r110, if r68
has a hard reg but r69 does not, then pseudo r110 will show a cost
savings if it is allocated into the same hard reg as pseudo r68.
The problematic code is add_insn_allocno_copies:
  {
    extract_insn (insn);
    for (i = 0; i < recog_data.n_operands; i++)
      {
        operand = recog_data.operand[i];
        if (REG_SUBREG_P (operand)
            && find_reg_note (insn, REG_DEAD,
                              REG_P (operand)
                              ? operand : SUBREG_REG (operand)) != NULL_RTX)
          {
            str = recog_data.constraints[i];
            while (*str == ' ' || *str == '\t')
              str++;
            bound_p = false;
            for (j = 0, commut_p = false; j < 2; j++, commut_p = true)
              if ((dup = get_dup (i, commut_p)) != NULL_RTX
                  && REG_SUBREG_P (dup)
                  && process_regs_for_copy (operand, dup, true,
                                            NULL_RTX, freq))
                bound_p = true;
            if (bound_p)
              continue;
            /* If an operand dies, prefer its hard register for the
               output operands by decreasing the hard register cost
               or creating the corresponding allocno copies.  The
               cost will not correspond to a real move insn cost, so
               make the frequency smaller.  */
            process_reg_shuffles (operand, i, freq < 8 ? 1 : freq / 8);
          }
      }
With r68 dying and not bound to another operand, we create a
reg-shuffle copy to encourage tying r68 to the output operand. Not good.
ISTM that if an output is already bound to some input, we should
not record a copy between an unbound dying input and the bound
output.
You can play with the attached (meaningless) testcase with -O2 -m32 -fPIC.
You won't see code quality regressions due to this problem on this
testcase with the mainline sources, but it should be enough to trigger
the bogus copy.
Thoughts?
Yes, Jeff, you are probably right that we should not make the shuffle
copy when there is already a constraint copy to the operand. There is
no regression on the test because the shuffle copy always has the
smaller cost. I introduced the shuffle copies because they improved
SPEC2000 rates for x86/x86_64, but I did not try to write code to
avoid the situation you found.
I think we should try removing the shuffle copy (or even discourage
such shuffles through a negative copy cost) and see what happens to
the SPEC rates.