This patch adds stub support for -mcpu=power9, so that GCC can later generate code for Power9 systems (ISA 3.0). At this time, the stub only sets up the switches. Future patches to GCC 6.x, and later GCC 7.x, will add support for the individual power9 features.
I have bootstrapped this on a big endian power7 system and a little endian power8 system with no regressions. Is this patch ok to install in the trunk? I would also like to back port this initial support into GCC 5.x. Is that ok as well?

2015-11-03  Michael Meissner  <meiss...@linux.vnet.ibm.com>

	* config/rs6000/rs6000.opt (-mfusion-toc): Add new switches for
	ISA 3.0 (power9).
	(-mpower9-fusion): Likewise.
	(-mpower9-vector): Likewise.
	(-mmodulo): Likewise.
	(-mfloat128-hardware): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Add option
	mask for ISA 3.0 (power9).
	(POWERPC_MASKS): Add new ISA 3.0 switches.
	(power9 cpu): Add power9 cpu.
	* config/rs6000/rs6000.h (ASM_CPU_POWER9_SPEC): Add support for
	power9.
	(ASM_CPU_SPEC): Likewise.
	(EXTRA_SPECS): Likewise.
	* config/rs6000/rs6000.c (power9_cost): Initial cost setup for
	power9.
	(rs6000_debug_reg_global): Add support for power9 fusion.
	(rs6000_setup_reg_addr_masks): Cache mode size.
	(rs6000_option_override_internal): Until real power9 tuning is
	added, use -mtune=power8 for -mcpu=power9.
	(rs6000_option_override_internal): Add support for ISA 3.0
	switches.
	(rs6000_loop_align): Add support for power9 cpu.
	(rs6000_file_start): Likewise.
	(rs6000_adjust_cost): Likewise.
	(rs6000_issue_rate): Likewise.
	(insn_must_be_first_in_group): Likewise.
	(insn_must_be_last_in_group): Likewise.
	(force_new_group): Likewise.
	(rs6000_register_move_cost): Likewise.
	(rs6000_opt_masks): Likewise.
	* config/rs6000/rs6000.md (cpu attribute): Add power9.
	* config/rs6000/rs6000-tables.opt: Regenerate.
	* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
	_ARCH_PWR9 if power9 support is available.
	* config/rs6000/aix61.h (ASM_CPU_SPEC): Add power9.
	* config/rs6000/aix53.h (ASM_CPU_SPEC): Likewise.
	* configure.ac: Determine if the assembler supports the ISA 3.0
	instructions.
	* config.in (HAVE_AS_POWER9): Likewise.
	* configure: Regenerate.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document ISA 3.0
	switches.
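
As a rough illustration of what the new _ARCH_PWR9 macro is for (this snippet is not part of the patch), user code could select an implementation at compile time along the following lines. The function name power_arch_level is made up for the example; only the _ARCH_PWR7, _ARCH_PWR8, and _ARCH_PWR9 macros come from rs6000_target_modify_macros. On compilers or targets that define none of them, the sketch falls through to the generic case.

```c
/* Illustrative only: report the newest Power ISA level advertised by the
   compiler through the _ARCH_PWRn predefined macros.  The function name is
   hypothetical; the macros are the ones set by rs6000_target_modify_macros.  */
static const char *
power_arch_level (void)
{
#if defined (_ARCH_PWR9)
  /* With this patch, defined when OPTION_MASK_MODULO is set.  */
  return "power9";
#elif defined (_ARCH_PWR8)
  return "power8";
#elif defined (_ARCH_PWR7)
  return "power7";
#else
  /* Older PowerPC targets, or a non-PowerPC compiler.  */
  return "generic";
#endif
}
```

Since -mcpu=power9 enables -mmodulo through ISA_3_0_MASKS_SERVER, a file built with -mcpu=power9 would be expected to take the first branch once the patch is applied.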
-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 229674)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -561,6 +561,10 @@ mpower8-vector
 Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
 Use/do not use vector and scalar instructions added in ISA 2.07.
 
+mfusion-toc
+Target Undocumented Mask(FUSION_TOC) Var(rs6000_isa_flags)
+Fuse medium/large code model toc references to the memory instruction.
+
 mcrypto
 Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
 Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.
@@ -601,6 +605,22 @@ moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
 
+mpower9-fusion
+Target Report Mask(P9_FUSION) Var(rs6000_isa_flags)
+Fuse certain operations together for better performance on power9.
+
+mpower9-vector
+Target Report Mask(P9_VECTOR) Var(rs6000_isa_flags)
+Use/do not use vector and scalar instructions added in ISA 3.0.
+
+mmodulo
+Target Report Mask(MODULO) Var(rs6000_isa_flags)
+Generate the integer modulo instructions.
+
 mfloat128
 Target Report Mask(FLOAT128) Var(rs6000_isa_flags)
 Enable/disable IEEE 128-bit floating point via the __float128 keyword.
+
+mfloat128-hardware
+Target Report Mask(FLOAT128_HW) Var(rs6000_isa_flags)
+Enable/disable using IEEE 128-bit floating point instructions.
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 229674)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -60,6 +60,14 @@
 				 | OPTION_MASK_QUAD_MEMORY_ATOMIC	\
 				 | OPTION_MASK_UPPER_REGS_SF)
 
+/* Add ISEL back into ISA 3.0, since it is supposed to be a win.
*/ +#define ISA_3_0_MASKS_SERVER (ISA_2_7_MASKS_SERVER \ + | OPTION_MASK_FLOAT128_HW \ + | OPTION_MASK_ISEL \ + | OPTION_MASK_MODULO \ + | OPTION_MASK_P9_FUSION \ + | OPTION_MASK_P9_VECTOR) + #define POWERPC_7400_MASK (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC) /* Deal with ports that do not have -mstrict-align. */ @@ -83,14 +91,18 @@ | OPTION_MASK_EFFICIENT_UNALIGNED_VSX \ | OPTION_MASK_FLOAT128 \ | OPTION_MASK_FPRND \ + | OPTION_MASK_FUSION_TOC \ | OPTION_MASK_HTM \ | OPTION_MASK_ISEL \ | OPTION_MASK_MFCRF \ | OPTION_MASK_MFPGPR \ + | OPTION_MASK_MODULO \ | OPTION_MASK_MULHW \ | OPTION_MASK_NO_UPDATE \ | OPTION_MASK_P8_FUSION \ | OPTION_MASK_P8_VECTOR \ + | OPTION_MASK_P9_FUSION \ + | OPTION_MASK_P9_VECTOR \ | OPTION_MASK_POPCNTB \ | OPTION_MASK_POPCNTD \ | OPTION_MASK_POWERPC64 \ @@ -195,6 +207,7 @@ RS6000_CPU ("power7", PROCESSOR_POWER7, | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF) RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER) +RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER) RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0) RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64) RS6000_CPU ("powerpc64le", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER) Index: gcc/config/rs6000/rs6000.h =================================================================== --- gcc/config/rs6000/rs6000.h (revision 229674) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -95,6 +95,12 @@ #define ASM_CPU_POWER8_SPEC ASM_CPU_POWER7_SPEC #endif +#ifdef HAVE_AS_POWER9 +#define ASM_CPU_POWER9_SPEC "-mpower9" +#else +#define ASM_CPU_POWER9_SPEC ASM_CPU_POWER8_SPEC +#endif + #ifdef HAVE_AS_DCI #define ASM_CPU_476_SPEC "-m476" #else @@ -119,6 +125,7 @@ %{mcpu=power6x: %(asm_cpu_power6) -maltivec} \ %{mcpu=power7: %(asm_cpu_power7)} \ %{mcpu=power8: %(asm_cpu_power8)} \ +%{mcpu=power9: %(asm_cpu_power9)} \ %{mcpu=a2: 
-ma2} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc64} \ @@ -193,6 +200,7 @@ { "asm_cpu_power6", ASM_CPU_POWER6_SPEC }, \ { "asm_cpu_power7", ASM_CPU_POWER7_SPEC }, \ { "asm_cpu_power8", ASM_CPU_POWER8_SPEC }, \ + { "asm_cpu_power9", ASM_CPU_POWER9_SPEC }, \ { "asm_cpu_476", ASM_CPU_476_SPEC }, \ SUBTARGET_EXTRA_SPECS Index: gcc/config/rs6000/rs6000-opts.h =================================================================== --- gcc/config/rs6000/rs6000-opts.h (revision 229674) +++ gcc/config/rs6000/rs6000-opts.h (working copy) @@ -60,6 +60,7 @@ enum processor_type PROCESSOR_POWER6, PROCESSOR_POWER7, PROCESSOR_POWER8, + PROCESSOR_POWER9, PROCESSOR_RS64A, PROCESSOR_MPCCORE, Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 229674) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -985,6 +985,26 @@ struct processor_costs power8_cost = { COSTS_N_INSNS (3), /* SF->DF convert */ }; +/* Instruction costs on POWER9 processors. */ +static const +struct processor_costs power9_cost = { + COSTS_N_INSNS (3), /* mulsi */ + COSTS_N_INSNS (3), /* mulsi_const */ + COSTS_N_INSNS (3), /* mulsi_const9 */ + COSTS_N_INSNS (3), /* muldi */ + COSTS_N_INSNS (19), /* divsi */ + COSTS_N_INSNS (35), /* divdi */ + COSTS_N_INSNS (3), /* fp */ + COSTS_N_INSNS (3), /* dmul */ + COSTS_N_INSNS (14), /* sdiv */ + COSTS_N_INSNS (17), /* ddiv */ + 128, /* cache line size */ + 32, /* l1 cache */ + 256, /* l2 cache */ + 12, /* prefetch streams */ + COSTS_N_INSNS (3), /* SF->DF convert */ +}; + /* Instruction costs on POWER A2 processors. */ static const struct processor_costs ppca2_cost = { @@ -2423,8 +2443,18 @@ rs6000_debug_reg_global (void) fprintf (stderr, DEBUG_FMT_S, "lra", "true"); if (TARGET_P8_FUSION) - fprintf (stderr, DEBUG_FMT_S, "p8 fusion", - (TARGET_P8_FUSION_SIGN) ? "zero+sign" : "zero"); + { + char options[80]; + + strcpy (options, (TARGET_P9_FUSION) ? 
"power9" : "power8"); + if (TARGET_FUSION_TOC) + strcat (options, ", toc"); + + if (TARGET_P8_FUSION_SIGN) + strcat (options, ", sign"); + + fprintf (stderr, DEBUG_FMT_S, "fusion", options); + } fprintf (stderr, DEBUG_FMT_S, "plt-format", TARGET_SECURE_PLT ? "secure" : "bss"); @@ -2463,6 +2493,7 @@ rs6000_setup_reg_addr_masks (void) for (m = 0; m < NUM_MACHINE_MODES; ++m) { machine_mode m2 = (machine_mode)m; + unsigned short msize = GET_MODE_SIZE (m2); /* SDmode is special in that we want to access it only via REG+REG addressing on power7 and above, since we want to use the LFIWZX and @@ -2496,12 +2527,12 @@ rs6000_setup_reg_addr_masks (void) if (TARGET_UPDATE && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR) - && GET_MODE_SIZE (m2) <= 8 + && msize <= 8 && !VECTOR_MODE_P (m2) && !FLOAT128_VECTOR_P (m2) && !COMPLEX_MODE_P (m2) && !indexed_only_p - && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)) + && !(TARGET_E500_DOUBLE && msize == 8)) { addr_mask |= RELOAD_REG_PRE_INCDEC; @@ -3382,7 +3413,22 @@ rs6000_option_override_internal (bool gl if (rs6000_tune_index >= 0) tune_index = rs6000_tune_index; else if (have_cpu) - rs6000_tune_index = tune_index = cpu_index; + { + /* Until power9 tuning is available, use power8 tuning if -mcpu=power9. */ + if (processor_target_table[cpu_index].processor != PROCESSOR_POWER9) + rs6000_tune_index = tune_index = cpu_index; + else + { + size_t i; + tune_index = -1; + for (i = 0; i < ARRAY_SIZE (processor_target_table); i++) + if (processor_target_table[i].processor == PROCESSOR_POWER8) + { + rs6000_tune_index = tune_index = i; + break; + } + } + } else { size_t i; @@ -3557,7 +3603,9 @@ rs6000_option_override_internal (bool gl /* For the newer switches (vsx, dfp, etc.) set some of the older options, unless the user explicitly used the -mno-<option> to disable the code. 
*/ - if (TARGET_P8_VECTOR || TARGET_DIRECT_MOVE || TARGET_CRYPTO) + if (TARGET_P9_VECTOR || TARGET_MODULO) + rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~rs6000_isa_flags_explicit); + else if (TARGET_P8_VECTOR || TARGET_DIRECT_MOVE || TARGET_CRYPTO) rs6000_isa_flags |= (ISA_2_7_MASKS_SERVER & ~rs6000_isa_flags_explicit); else if (TARGET_VSX) rs6000_isa_flags |= (ISA_2_6_MASKS_SERVER & ~rs6000_isa_flags_explicit); @@ -3703,6 +3751,41 @@ rs6000_option_override_internal (bool gl rs6000_isa_flags |= (processor_target_table[tune_index].target_enable & OPTION_MASK_P8_FUSION); + /* Setting additional fusion flags turns on base fusion. */ + if (!TARGET_P8_FUSION && (TARGET_P8_FUSION_SIGN || TARGET_FUSION_TOC)) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION) + { + if (TARGET_P8_FUSION_SIGN) + error ("-mpower8-fusion-sign requires -mpower8-fusion"); + + if (TARGET_FUSION_TOC) + error ("-mfusion-toc requires -mpower8-fusion"); + + rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION; + } + else + rs6000_isa_flags |= OPTION_MASK_P8_FUSION; + } + + /* Power9 fusion is a superset over power8 fusion. */ + if (TARGET_P9_FUSION && !TARGET_P8_FUSION) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION) + { + error ("-mpower9-fusion requires -mpower8-fusion"); + rs6000_isa_flags &= ~OPTION_MASK_P9_FUSION; + } + else + rs6000_isa_flags |= OPTION_MASK_P8_FUSION; + } + + /* Enable power9 fusion if we are tuning for power9, even if we aren't + generating power9 instructions. */ + if (!(rs6000_isa_flags_explicit & OPTION_MASK_P9_FUSION)) + rs6000_isa_flags |= (processor_target_table[tune_index].target_enable + & OPTION_MASK_P9_FUSION); + /* Power8 does not fuse sign extended loads with the addis. If we are optimizing at high levels for speed, convert a sign extended load into a zero extending load, and an explicit sign extension. 
*/ @@ -3712,6 +3795,36 @@ rs6000_option_override_internal (bool gl && optimize >= 3) rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN; + /* TOC fusion requires 64-bit and medium/large code model. */ + if (TARGET_FUSION_TOC && !TARGET_POWERPC64) + { + rs6000_isa_flags &= ~OPTION_MASK_FUSION_TOC; + if ((rs6000_isa_flags_explicit & OPTION_MASK_FUSION_TOC) != 0) + warning (0, N_("-mfusion-toc requires 64-bit")); + } + + if (TARGET_FUSION_TOC && (TARGET_CMODEL == CMODEL_SMALL)) + { + rs6000_isa_flags &= ~OPTION_MASK_FUSION_TOC; + if ((rs6000_isa_flags_explicit & OPTION_MASK_FUSION_TOC) != 0) + warning (0, N_("-mfusion-toc requires medium/large code model")); + } + + /* Turn on -mfusion-toc by default if p8-fusion and 64-bit medium/large code + model. */ + if (TARGET_P8_FUSION && !TARGET_FUSION_TOC && TARGET_POWERPC64 + && (TARGET_CMODEL != CMODEL_SMALL) + && !(rs6000_isa_flags_explicit & OPTION_MASK_FUSION_TOC)) + rs6000_isa_flags |= OPTION_MASK_FUSION_TOC; + + /* ISA 2.08 vector instructions include ISA 2.07. */ + if (TARGET_P9_VECTOR && !TARGET_P8_VECTOR) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_P9_VECTOR) + error ("-mpower9-vector requires -mpower8-vector"); + rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR; + } + /* Set -mallow-movmisalign to explicitly on if we have full ISA 2.07 support. If we only have ISA 2.06 support, and the user did not specify the switch, leave it set to -1 so the movmisalign patterns are enabled, @@ -3757,9 +3870,32 @@ rs6000_option_override_internal (bool gl if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128) != 0) error ("-mfloat128 requires VSX support"); - rs6000_isa_flags &= ~OPTION_MASK_FLOAT128; + rs6000_isa_flags &= ~(OPTION_MASK_FLOAT128 | OPTION_MASK_FLOAT128_HW); + } + + /* IEEE 128-bit floating point hardware instructions imply enabling + __float128. 
*/ + if (TARGET_FLOAT128_HW + && (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR + | OPTION_MASK_DIRECT_MOVE + | OPTION_MASK_UPPER_REGS_DF + | OPTION_MASK_UPPER_REGS_SF)) == 0) + { + if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0) + error ("-mfloat128-hardware requires full ISA 3.0 support"); + + rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW; } + else if (TARGET_P9_VECTOR && !TARGET_FLOAT128_HW + && (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) == 0) + rs6000_isa_flags |= OPTION_MASK_FLOAT128_HW; + + if (TARGET_FLOAT128_HW + && (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128) == 0) + rs6000_isa_flags |= OPTION_MASK_FLOAT128; + + /* Print the options after updating the defaults. */ if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags); @@ -3957,18 +4093,21 @@ rs6000_option_override_internal (bool gl && rs6000_cpu != PROCESSOR_POWER6 && rs6000_cpu != PROCESSOR_POWER7 && rs6000_cpu != PROCESSOR_POWER8 + && rs6000_cpu != PROCESSOR_POWER9 && rs6000_cpu != PROCESSOR_PPCA2 && rs6000_cpu != PROCESSOR_CELL && rs6000_cpu != PROCESSOR_PPC476); rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4 || rs6000_cpu == PROCESSOR_POWER5 || rs6000_cpu == PROCESSOR_POWER7 - || rs6000_cpu == PROCESSOR_POWER8); + || rs6000_cpu == PROCESSOR_POWER8 + || rs6000_cpu == PROCESSOR_POWER9); rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4 || rs6000_cpu == PROCESSOR_POWER5 || rs6000_cpu == PROCESSOR_POWER6 || rs6000_cpu == PROCESSOR_POWER7 || rs6000_cpu == PROCESSOR_POWER8 + || rs6000_cpu == PROCESSOR_POWER9 || rs6000_cpu == PROCESSOR_PPCE500MC || rs6000_cpu == PROCESSOR_PPCE500MC64 || rs6000_cpu == PROCESSOR_PPCE5500 @@ -4216,6 +4355,10 @@ rs6000_option_override_internal (bool gl rs6000_cost = &power8_cost; break; + case PROCESSOR_POWER9: + rs6000_cost = &power9_cost; + break; + case PROCESSOR_PPCA2: rs6000_cost = &ppca2_cost; break; @@ -4396,7 +4539,8 @@ rs6000_loop_align (rtx label) || rs6000_cpu 
== PROCESSOR_POWER5 || rs6000_cpu == PROCESSOR_POWER6 || rs6000_cpu == PROCESSOR_POWER7 - || rs6000_cpu == PROCESSOR_POWER8)) + || rs6000_cpu == PROCESSOR_POWER8 + || rs6000_cpu == PROCESSOR_POWER9)) return 5; else return align_loops_log; @@ -5213,7 +5357,9 @@ rs6000_file_start (void) || !global_options_set.x_rs6000_cpu_index) { fputs ("\t.machine ", asm_out_file); - if ((rs6000_isa_flags & OPTION_MASK_DIRECT_MOVE) != 0) + if ((rs6000_isa_flags & OPTION_MASK_MODULO) != 0) + fputs ("power9\n", asm_out_file); + else if ((rs6000_isa_flags & OPTION_MASK_DIRECT_MOVE) != 0) fputs ("power8\n", asm_out_file); else if ((rs6000_isa_flags & OPTION_MASK_POPCNTD) != 0) fputs ("power7\n", asm_out_file); @@ -28006,6 +28152,7 @@ rs6000_adjust_cost (rtx_insn *insn, rtx || rs6000_cpu_attr == CPU_POWER5 || rs6000_cpu_attr == CPU_POWER7 || rs6000_cpu_attr == CPU_POWER8 + || rs6000_cpu_attr == CPU_POWER9 || rs6000_cpu_attr == CPU_CELL) && recog_memoized (dep_insn) && (INSN_CODE (dep_insn) >= 0)) @@ -28578,6 +28725,7 @@ rs6000_issue_rate (void) case CPU_POWER7: return 5; case CPU_POWER8: + case CPU_POWER9: return 7; default: return 1; @@ -29211,6 +29359,7 @@ insn_must_be_first_in_group (rtx_insn *i } break; case PROCESSOR_POWER8: + case PROCESSOR_POWER9: type = get_attr_type (insn); switch (type) @@ -29341,6 +29490,7 @@ insn_must_be_last_in_group (rtx_insn *in } break; case PROCESSOR_POWER8: + case PROCESSOR_POWER9: type = get_attr_type (insn); switch (type) @@ -29459,7 +29609,7 @@ force_new_group (int sched_verbose, FILE /* Do we have a special group ending nop? 
*/ if (rs6000_cpu_attr == CPU_POWER6 || rs6000_cpu_attr == CPU_POWER7 - || rs6000_cpu_attr == CPU_POWER8) + || rs6000_cpu_attr == CPU_POWER8 || rs6000_cpu_attr == CPU_POWER9) { nop = gen_group_ending_nop (); emit_insn_before (nop, next_insn); @@ -31959,7 +32109,8 @@ rs6000_register_move_cost (machine_mode expensive than memory in order to bias spills to memory .*/ else if ((rs6000_cpu == PROCESSOR_POWER6 || rs6000_cpu == PROCESSOR_POWER7 - || rs6000_cpu == PROCESSOR_POWER8) + || rs6000_cpu == PROCESSOR_POWER8 + || rs6000_cpu == PROCESSOR_POWER9) && reg_classes_intersect_p (rclass, LINK_OR_CTR_REGS)) ret = 6 * hard_regno_nregs[0][mode]; @@ -33489,12 +33640,15 @@ static struct rs6000_opt_mask const rs60 { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX, false, true }, { "float128", OPTION_MASK_FLOAT128, false, true }, + { "float128-hardware", OPTION_MASK_FLOAT128_HW, false, true }, { "fprnd", OPTION_MASK_FPRND, false, true }, + { "fusion-toc", OPTION_MASK_FUSION_TOC, false, true }, { "hard-dfp", OPTION_MASK_DFP, false, true }, { "htm", OPTION_MASK_HTM, false, true }, { "isel", OPTION_MASK_ISEL, false, true }, { "mfcrf", OPTION_MASK_MFCRF, false, true }, { "mfpgpr", OPTION_MASK_MFPGPR, false, true }, + { "modulo", OPTION_MASK_MODULO, false, true }, { "mulhw", OPTION_MASK_MULHW, false, true }, { "multiple", OPTION_MASK_MULTIPLE, false, true }, { "popcntb", OPTION_MASK_POPCNTB, false, true }, @@ -33502,6 +33656,8 @@ static struct rs6000_opt_mask const rs60 { "power8-fusion", OPTION_MASK_P8_FUSION, false, true }, { "power8-fusion-sign", OPTION_MASK_P8_FUSION_SIGN, false, true }, { "power8-vector", OPTION_MASK_P8_VECTOR, false, true }, + { "power9-fusion", OPTION_MASK_P9_FUSION, false, true }, + { "power9-vector", OPTION_MASK_P9_VECTOR, false, true }, { "powerpc-gfxopt", OPTION_MASK_PPC_GFXOPT, false, true }, { "powerpc-gpopt", OPTION_MASK_PPC_GPOPT, false, true }, { "quad-memory", OPTION_MASK_QUAD_MEMORY, false, true }, Index: gcc/config/rs6000/rs6000.md 
=================================================================== --- gcc/config/rs6000/rs6000.md (revision 229674) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -252,7 +252,7 @@ (define_attr "cpu" ppc750,ppc7400,ppc7450, ppc403,ppc405,ppc440,ppc476, ppc8540,ppc8548,ppce300c2,ppce300c3,ppce500mc,ppce500mc64,ppce5500,ppce6500, - power4,power5,power6,power7,power8, + power4,power5,power6,power7,power8,power9, rs64a,mpccore,cell,ppca2,titan" (const (symbol_ref "rs6000_cpu_attr"))) Index: gcc/config/rs6000/rs6000-tables.opt =================================================================== --- gcc/config/rs6000/rs6000-tables.opt (revision 229674) +++ gcc/config/rs6000/rs6000-tables.opt (working copy) @@ -180,14 +180,17 @@ EnumValue Enum(rs6000_cpu_opt_value) String(power8) Value(50) EnumValue -Enum(rs6000_cpu_opt_value) String(powerpc) Value(51) +Enum(rs6000_cpu_opt_value) String(power9) Value(51) EnumValue -Enum(rs6000_cpu_opt_value) String(powerpc64) Value(52) +Enum(rs6000_cpu_opt_value) String(powerpc) Value(52) EnumValue -Enum(rs6000_cpu_opt_value) String(powerpc64le) Value(53) +Enum(rs6000_cpu_opt_value) String(powerpc64) Value(53) EnumValue -Enum(rs6000_cpu_opt_value) String(rs64) Value(54) +Enum(rs6000_cpu_opt_value) String(powerpc64le) Value(54) + +EnumValue +Enum(rs6000_cpu_opt_value) String(rs64) Value(55) Index: gcc/config/rs6000/rs6000-c.c =================================================================== --- gcc/config/rs6000/rs6000-c.c (revision 229674) +++ gcc/config/rs6000/rs6000-c.c (working copy) @@ -349,6 +349,8 @@ rs6000_target_modify_macros (bool define rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7"); if ((flags & OPTION_MASK_DIRECT_MOVE) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8"); + if ((flags & OPTION_MASK_MODULO) != 0) + rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9"); if ((flags & OPTION_MASK_SOFT_FLOAT) != 0) rs6000_define_or_undefine_macro (define_p, "_SOFT_FLOAT"); if ((flags & 
OPTION_MASK_RECIP_PRECISION) != 0) Index: gcc/config/rs6000/aix61.h =================================================================== --- gcc/config/rs6000/aix61.h (revision 229674) +++ gcc/config/rs6000/aix61.h (working copy) @@ -80,6 +80,7 @@ do { \ %{mcpu=power6x: -mpwr6} \ %{mcpu=power7: -mpwr7} \ %{mcpu=power8: -mpwr8} \ +%{mcpu=power9: -mpwr9} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ Index: gcc/config/rs6000/aix53.h =================================================================== --- gcc/config/rs6000/aix53.h (revision 229674) +++ gcc/config/rs6000/aix53.h (working copy) @@ -63,6 +63,7 @@ do { \ %{mcpu=power6x: -mpwr6} \ %{mcpu=power7: -mpwr7} \ %{mcpu=power8: -mpwr8} \ +%{mcpu=power9: -mpwr9} \ %{mcpu=powerpc: -mppc} \ %{mcpu=rs64a: -mppc} \ %{mcpu=603: -m603} \ Index: gcc/configure.ac =================================================================== --- gcc/configure.ac (revision 229674) +++ gcc/configure.ac (working copy) @@ -4326,6 +4326,19 @@ LCF0: [Define if your assembler supports POWER8 instructions.])]) case $target in + *-*-aix*) conftest_s=' .machine "pwr9" + .csect .text[[PR]]';; + *) conftest_s=' .machine power9 + .text';; + esac + + gcc_GAS_CHECK_FEATURE([power9 support], + gcc_cv_as_powerpc_power9, [2,19,2], -a32, + [$conftest_s],, + [AC_DEFINE(HAVE_AS_POWER9, 1, + [Define if your assembler supports POWER9 instructions.])]) + + case $target in *-*-aix*) conftest_s=' .csect .text[[PR]] lwsync';; *) conftest_s=' .text Index: gcc/configure =================================================================== --- gcc/configure (revision 229674) +++ gcc/configure (working copy) @@ -26315,6 +26315,48 @@ $as_echo "#define HAVE_AS_POWER8 1" >>co fi case $target in + *-*-aix*) conftest_s=' .machine "pwr9" + .csect .text[PR]';; + *) conftest_s=' .machine power9 + .text';; + esac + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for power9 support" >&5 +$as_echo_n "checking assembler for power9 support... 
" >&6; } +if test "${gcc_cv_as_powerpc_power9+set}" = set; then : + $as_echo_n "(cached) " >&6 +else + gcc_cv_as_powerpc_power9=no + if test $in_tree_gas = yes; then + if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2` + then gcc_cv_as_powerpc_power9=yes +fi + elif test x$gcc_cv_as != x; then + $as_echo "$conftest_s" > conftest.s + if { ac_try='$gcc_cv_as $gcc_cv_as_flags -a32 -o conftest.o conftest.s >&5' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; } + then + gcc_cv_as_powerpc_power9=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_powerpc_power9" >&5 +$as_echo "$gcc_cv_as_powerpc_power9" >&6; } +if test $gcc_cv_as_powerpc_power9 = yes; then + +$as_echo "#define HAVE_AS_POWER9 1" >>confdefs.h + +fi + + case $target in *-*-aix*) conftest_s=' .csect .text[PR] lwsync';; *) conftest_s=' .text Index: gcc/config.in =================================================================== --- gcc/config.in (revision 229674) +++ gcc/config.in (working copy) @@ -563,6 +563,12 @@ #endif +/* Define if your assembler supports POWER9 instructions. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_POWER9 +#endif + + /* Define if your assembler supports .ref */ #ifndef USED_FOR_TARGET #undef HAVE_AS_REF Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 229674) +++ gcc/doc/invoke.texi (working copy) @@ -946,8 +946,9 @@ See RS/6000 and PowerPC Options. 
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
 -mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
--mupper-regs -mno-upper-regs @gol
--mfloat128 -mno-float128}
+-mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol
+-mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol
+-mpower9-fusion -mno-power9-fusion -mpower9-vector -mno-power9-vector}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol
@@ -19275,8 +19276,9 @@ Supported values for @var{cpu_type} are
 @samp{e300c3}, @samp{e500mc}, @samp{e500mc64}, @samp{e5500},
 @samp{e6500}, @samp{ec603e}, @samp{G3}, @samp{G4}, @samp{G5},
 @samp{titan}, @samp{power3}, @samp{power4}, @samp{power5}, @samp{power5+},
-@samp{power6}, @samp{power6x}, @samp{power7}, @samp{power8}, @samp{powerpc},
-@samp{powerpc64}, @samp{powerpc64le}, and @samp{rs64}.
+@samp{power6}, @samp{power6x}, @samp{power7}, @samp{power8},
+@samp{power9}, @samp{powerpc}, @samp{powerpc64}, @samp{powerpc64le},
+and @samp{rs64}.
 
 @option{-mcpu=powerpc}, @option{-mcpu=powerpc64}, and
 @option{-mcpu=powerpc64le} specify pure 32-bit PowerPC (either
@@ -19296,7 +19298,8 @@ following options:
 -mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol
 -msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx @gol
 -mcrypto -mdirect-move -mpower8-fusion -mpower8-vector @gol
--mquad-memory -mquad-memory-atomic}
+-mquad-memory -mquad-memory-atomic -mmodulo -mfloat128 -mfloat128-hardware @gol
+-mpower9-fusion -mpower9-vector}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -19533,12 +19536,45 @@ If the @option{-mno-upper-regs} option i
 @opindex mfloat128
 @opindex mno-float128
 Enable/disable the @var{__float128} keyword for IEEE 128-bit floating point
-and use software emulation for IEEE 128-bit floating point.
+and use either software emulation for IEEE 128-bit floating point or
+hardware instructions.
 
 The VSX instruction set (@option{-mvsx}, @option{-mcpu=power7}, or
 @option{-mcpu=power8}) must be enabled to use the @option{-mfloat128}
 option.
 
+@item -mfloat128-hardware
+@itemx -mno-float128-hardware
+@opindex mfloat128-hardware
+@opindex mno-float128-hardware
+Enable/disable using ISA 3.0 hardware instructions to support the
+@var{__float128} data type.
+
+@item -mmodulo
+@itemx -mno-modulo
+@opindex mmodulo
+@opindex mno-modulo
+Generate code that uses (does not use) the ISA 3.0 integer modulo
+instructions.  The @option{-mmodulo} option is enabled by default
+with the @option{-mcpu=power9} option.
+
+@item -mpower9-fusion
+@itemx -mno-power9-fusion
+@opindex mpower9-fusion
+@opindex mno-power9-fusion
+Generate code that keeps (does not keep) some operations adjacent so
+that the instructions can be fused together on power9 and later
+processors.
+
+@item -mpower9-vector
+@itemx -mno-power9-vector
+@opindex mpower9-vector
+@opindex mno-power9-vector
+Generate code that uses (does not use) the vector and scalar
+instructions that were added in version 3.0 of the PowerPC ISA.  Also
+enable the use of built-in functions that allow more direct access to
+the vector instructions.
+
 @item -mfloat-gprs=@var{yes/single/double/no}
 @itemx -mfloat-gprs
 @opindex mfloat-gprs
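
For reviewers who want to check the switch dependencies in isolation, the -mpower9-fusion handling added to rs6000_option_override_internal can be modeled roughly as below. This is a standalone sketch and not code from the patch: the SKETCH_MASK_* values and the function sketch_resolve_p9_fusion are hypothetical stand-ins for OPTION_MASK_P8_FUSION, OPTION_MASK_P9_FUSION, and the inline logic, and the boolean return value replaces the call to error ().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for OPTION_MASK_P8_FUSION and
   OPTION_MASK_P9_FUSION; the real values are generated from rs6000.opt.  */
#define SKETCH_MASK_P8_FUSION (UINT64_C (1) << 0)
#define SKETCH_MASK_P9_FUSION (UINT64_C (1) << 1)

/* Model of the "power9 fusion is a superset of power8 fusion" rule.
   FLAGS holds the current switch state, EXPLICIT_FLAGS the switches the
   user set on the command line.  Returns false where the patch would
   report "-mpower9-fusion requires -mpower8-fusion".  */
static bool
sketch_resolve_p9_fusion (uint64_t *flags, uint64_t explicit_flags)
{
  bool p9 = (*flags & SKETCH_MASK_P9_FUSION) != 0;
  bool p8 = (*flags & SKETCH_MASK_P8_FUSION) != 0;

  /* Nothing to do unless power9 fusion is on without power8 fusion.  */
  if (!p9 || p8)
    return true;

  if (explicit_flags & SKETCH_MASK_P8_FUSION)
    {
      /* The user explicitly turned power8 fusion off, so power9 fusion
	 cannot be honored; drop it and signal the error.  */
      *flags &= ~SKETCH_MASK_P9_FUSION;
      return false;
    }

  /* Otherwise power9 fusion silently enables power8 fusion.  */
  *flags |= SKETCH_MASK_P8_FUSION;
  return true;
}
```

The same shape (an explicit -mno-<option> wins and produces an error, otherwise the prerequisite is turned on) is used in the patch for -mpower8-fusion-sign and -mfusion-toc; -mpower9-vector is different, in that it is turned off rather than enabling -mpower8-vector.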