On Sat, Aug 20, 2011 at 2:16 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Sat, Aug 20, 2011 at 2:09 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>
>> Don't expand RORX through ix86_expand_binary_operator, generate it
>> directly from expander. You are complicating things with splitters too
>> much!
>>
>> I will rewrite this part of i386.md.
>
> So, the attached RFC patch handles BMI2 mul, shift and ror stuff.
>
> Some remarks:
> - M and N register modifiers are added to print low and high register
> of a double word register pair. This is needed for mulx insn.
> - ishiftx and rotatex instruction type attributes are added.
> - "w" mode attribute is added to add register prefix for word mode.
> This is needed to output QImode count register of shift insns.
>
> - mulx is expanded directly from expander, IMO it is always a win to
> generate this insn if available.
>
> - Yb register constraint is added to conditionally enable generation
> of BMI alternatives in generic shift and rotate patterns. The BMI
> variant is generated only if RA chooses it as the most profitable
> alternative.
> - shift and rotate instructions are split post-reload from generic
> patterns to strip flags clobber.
> - zero-extended 64bit variants are also handled for shift and rotate insns.
> - rotate right AND rotate left instructions are handled through rorx.
>
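
As a rough illustration of the last remark above (this is not taken from
the patch): RORX only rotates right by an immediate count and does not
touch the flags, so a rotate left by n can be emitted as a rotate right by
(width - n). A minimal C sketch of that identity for 32-bit values follows;
the helper names rotr32 and rotl32_via_rotr are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by a constant count in [0, 31].  This models the value
   that a 32-bit rotate-right such as rorx produces.  */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    assert(n < 32);
    /* n == 0 is special-cased to avoid an undefined shift by 32.  */
    return n == 0 ? x : (x >> n) | (x << (32 - n));
}

/* Rotate left by n, expressed purely as a rotate right by (32 - n).
   The "& 31" maps a count of 0 back to a rotate right by 0.  */
static uint32_t rotl32_via_rotr(uint32_t x, unsigned n)
{
    assert(n < 32);
    return rotr32(x, (32 - n) & 31);
}
```

For example, rotl32_via_rotr(0x12345678, 8) yields 0x34567812, which is the
same value as rotating right by 24.
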
> 2011-08-20  Uros Bizjak  <ubiz...@gmail.com>
>
>        * config/i386/i386.md (type): Add ishiftx and rotatex.
>        (length_immediate): Handle ishiftx and rotatex.
>        (imm_disp): Ditto.
>        (w): New mode attribute.
>
>        (mul<mode><dwi>3): Split from <u>mul<mode><dwi>3.
>        (umul<mode><dwi>3): Ditto.  Generate bmi2_umul<mode><dwi>3_1 pattern
>        for TARGET_BMI2.
>        (bmi2_umul<mode><dwi>3_1): New insn pattern.
>
>        (*bmi2_ashl<mode>3_1): New insn pattern.
>        (*ashl<mode>3_1): Add ishiftx BMI2 alternative.
>        (*ashl<mode>3_1 splitter): New splitter to avoid flags dependency.
>        (*bmi2_ashlsi3_1_zext): New insn pattern.
>        (*ashlsi3_1_zext): Add ishiftx BMI2 alternative.
>        (*ashlsi3_1_zext splitter): New splitter to avoid flags dependency.
>
>        (*bmi2_<shiftrt_insn><mode>3_1): New insn pattern.
>        (*<shiftrt_insn><mode>3_1): Add ishiftx BMI2 alternative.
>        (*<shiftrt_insn><mode>3_1 splitter): New splitter to avoid
>        flags dependency.
>        (*bmi2_<shiftrt_insn>si3_1_zext): New insn pattern.
>        (*<shiftrt_insn>si3_1_zext): Add ishiftx BMI2 alternative.
>        (*<shiftrt_insn>si3_1_zext splitter): New splitter to avoid
>        flags dependency.
>
>        (*bmi2_rorx<mode>3_1): New insn pattern.
>        (*<rotate_insn><mode>3_1): Add rotatex BMI2 alternative.
>        (*rotate<mode>3_1 splitter): New splitter to avoid flags dependency.
>        (*rotatert<mode>3_1 splitter): Ditto.
>        (*bmi2_rorxsi3_1_zext): New insn pattern.
>        (*<rotate_insn>si3_1_zext): Add rotatex BMI2 alternative.
>        (*rotatesi3_1_zext splitter): New splitter to avoid flags dependency.
>        (*rotatertsi3_1_zext splitter): Ditto.
>
>        * config/i386/constraints.md (Yb): New register constraint.
>        * config/i386/i386.c (print_reg): Handle 'M' and 'N' modifiers.
>        (print_operand): Ditto.
>
> The patch is currently in RFC/RFT state, since I have no way to
> properly test it. The patch bootstraps OK and regression test is clean

We are using HSW emulator (SDE):

http://software.intel.com/en-us/articles/pre-release-license-agreement-for-intel-software-development-emulator-accept-end-user-license-agreement-and-download/

to test FMA, BMI/BMI2.  I have an SDE sim for dejagnu so that I can run
GCC testsuite under SDE.

> on x86_64-pc-linux-gnu {,-m32}. I tested the patch lightly on provided
> testcases, so expected patterns are generated. Oh, and all insn
> constraints should be changed from TARGET_BMI to TARGET_BMI2.
>
> Uros.
>

We can also implement MULX with a split:

(define_split
  [(parallel [(set (match_operand:<DWI> 0 "register_operand" "")
                   (mult:<DWI>
                     (zero_extend:<DWI>
                       (match_operand:DWIH 1 "nonimmediate_operand" ""))
                     (zero_extend:<DWI>
                       (match_operand:DWIH 2 "nonimmediate_operand" ""))))
              (clobber (reg:CC FLAGS_REG))])]
  "TARGET_BMI2
   && ix86_binary_operator_ok (MULT, <MODE>mode, operands)"
  [(set (match_operand:<DWI> 0 "register_operand" "")
        (mult:<DWI>
          (zero_extend:<DWI>
            (match_operand:DWIH 1 "register_operand" ""))
          (zero_extend:<DWI>
            (match_operand:DWIH 2 "nonimmediate_operand" ""))))])

(define_insn "*bmi2_umul<mode><dwi>3_1"
  [(set (match_operand:<DWI> 0 "register_operand" "=r")
        (mult:<DWI>
          (zero_extend:<DWI>
            (match_operand:DWIH 1 "register_operand" "d"))
          (zero_extend:<DWI>
            (match_operand:DWIH 2 "nonimmediate_operand" "rm"))))]
  "TARGET_BMI2"
  "mulx\t{%2, %M0, %N0|%N0, %M0, %2}"
  [(set_attr "type" "imul")
   (set_attr "prefix" "vex")
   (set_attr "mode" "<MODE>")])
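
For reference, here is a small C model of what the 32-bit form of mulx
computes, as I understand the instruction: an unsigned widening multiply
with one source implicit in %edx, the low and high halves of the product
written to two separate destination registers (what %M0 and %N0 print in
the template above), and the flags left untouched. The struct and function
names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Low and high halves of a 64-bit product, i.e. the double-word
   register pair that operand 0 stands for.  */
struct u32_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Model of 32-bit mulx: unsigned 32x32 -> 64 multiply of the implicit
   source (edx) by the explicit source (src).  No flags are modeled
   because the instruction does not modify them.  */
static struct u32_pair mulx32_model(uint32_t edx, uint32_t src)
{
    uint64_t product = (uint64_t)edx * (uint64_t)src;
    struct u32_pair r;

    r.lo = (uint32_t)product;
    r.hi = (uint32_t)(product >> 32);

    /* Sanity check: the two halves recombine to the full product.  */
    assert((((uint64_t)r.hi << 32) | r.lo) == product);
    return r;
}
```

The DImode case is the same computation with 64-bit inputs and a 128-bit
product, with the implicit source in %rdx.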

-- 
H.J.
