[gcc(refs/vendors/ibm/heads/mmaplus2)] Add support for dense math registers.

Michael Meissner via Gcc-cvs Tue, 03 Mar 2026 17:14:53 -0800

https://gcc.gnu.org/g:b79e117218c5b857590827f834cb3b659c001472


commit b79e117218c5b857590827f834cb3b659c001472
Author: Michael Meissner <[email protected]>
Date:   Tue Mar 3 20:09:33 2026 -0500

    Add support for dense math registers.
    
    This patch adds basic support for dense math registers.  It includes
    support for moving values to/from dense math registers.  The MMA
    instructions are not yet modified to know about dense math registers.
    The -mcpu=future option does not set -mdense-math in this patch.  A
    future patch will make these changes.
    
    The changes include:
    
       1:   XOmode moves include moving to/from dense math registers.
    
       2:   Add predicate dense_math_operand.
    
       3:   Make the predicate accumulator_operand match on dense math
            registers.
    
       4:   Add dense math register class.
    
       5:   Add the 8 dense math register accumulators with internal register
            numbers 111-118.
    
       6:   Make the 'wD' constraint match dense math registers if
            -mdense-math is in effect, and 4 adjacent VSX registers if
            -mno-dense-math is in effect.
    
       7:   Set up the reload information so that the register allocator
            knows that dense math registers do not have load or store
            instructions.  Instead, to read/write dense math registers, you
            have to use VSX registers as intermediaries.
    
       8:   Make the print_operand '%A' output operand know about
            accumulators in dense math registers and accumulators in 4
            adjacent VSX registers.
    
       9:   Update register move and memory load/store costs for dense math
            registers.
    
       10:  Make dense math registers a pressure class for register allocation.
    
       11:  Do not issue MMA deprime instructions if -mdense-math is in effect.
    
       12:  Add support for dense math registers to rs6000_split_multireg_move.
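
Items 5 and 8 above can be illustrated with a small standalone C sketch of how an internal register number might map to the accumulator number printed by '%A'.  This is not the GCC code: the helper names are hypothetical, and the FPR range 32-63 is an assumption about the rs6000 internal numbering (the patch itself only states 111-118 for the dense math registers).

```c
#include <assert.h>
#include <stdbool.h>

/* Dense math register range, as stated in the patch.  */
enum { FIRST_DM_REGNO = 111, LAST_DM_REGNO = 118 };

/* Assumption: FPRs use internal register numbers 32..63.  */
enum { FIRST_FPR_REGNO = 32, LAST_FPR_REGNO = 63 };

bool
dm_regno_p (int n)
{
  return n >= FIRST_DM_REGNO && n <= LAST_DM_REGNO;
}

bool
fp_regno_p (int n)
{
  return n >= FIRST_FPR_REGNO && n <= LAST_FPR_REGNO;
}

/* Hypothetical helper modeled on the '%A' case of print_operand: return
   the accumulator number for REGNO, or -1 where GCC would instead report
   "invalid %A value".  */
int
accumulator_number (int regno, bool dense_math)
{
  assert (regno >= 0);

  /* With -mdense-math, dmr0..dmr7 map directly to accumulators 0..7.  */
  if (dense_math && dm_regno_p (regno))
    return regno - FIRST_DM_REGNO;

  /* Otherwise an accumulator is 4 adjacent FPRs whose first register
     number is a multiple of 4.  */
  if (!fp_regno_p (regno) || (regno % 4) != 0)
    return -1;

  return (regno - FIRST_FPR_REGNO) / 4;
}
```

Under these assumptions, accumulator_number (113, true) and accumulator_number (40, false) both name accumulator 2, once as a dense math register and once as four adjacent FPRs.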
    
    The patches have been tested on both little and big endian systems.  Can I
    check it into the master branch?
    
    This is version 4 of the patches.  The previous patches were:
    
     * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707452.html
     * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707453.html
     * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707454.html
     * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707455.html
     * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707456.html
    
    gcc/
    
    2026-03-03   Michael Meissner  <[email protected]>
    
            * config/rs6000/mma.md (movxo): Convert to being a define_expand
            that can handle both the original MMA support without dense math
            registers, and adding dense math support.
            (movxo_nodm): Rename original movxo insn, and restrict this insn
            to when we do not have dense math registers.
            (movxo_dm): New define_insn_and_split for dense math registers.
            * config/rs6000/predicates.md (dense_math_operand): New predicate.
            (accumulator_operand): Add support for dense math registers.
            * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add dense math
            register support.
            (enum rs6000_reload_reg_type): Likewise.
            (LAST_RELOAD_REG_CLASS): Likewise.
            (reload_reg_map): Likewise.
            (rs6000_reg_names): Likewise.
            (alt_reg_names): Likewise.
            (rs6000_hard_regno_nregs_internal): Likewise.
            (rs6000_hard_regno_mode_ok_uncached): Likewise.
            (rs6000_debug_reg_global): Likewise.
            (rs6000_setup_reg_addr_masks): Likewise.
            (rs6000_init_hard_regno_mode_ok): Likewise.
            (rs6000_option_override_internal): Likewise.
            (rs6000_secondary_reload_memory): Likewise.
            (rs6000_secondary_reload_simple_move): Likewise.
            (rs6000_preferred_reload_class): Likewise.
            (rs6000_secondary_reload_class): Likewise.
            (print_operand): Likewise.
            (rs6000_dense_math_register_move_cost): New helper function.
            (rs6000_register_move_cost): Add dense math register support.
            (rs6000_memory_move_cost): Likewise.
            (rs6000_compute_pressure_classes): Likewise.
            (rs6000_debugger_regno): Likewise.
            (rs6000_opt_masks): Likewise.
            (rs6000_split_multireg_move): Likewise.
            * config/rs6000/rs6000.h (UNITS_PER_DM_WORD): New macro.
            (FIRST_PSEUDO_REGISTER): Add dense math register support.
            (FIXED_REGISTERS): Likewise.
            (CALL_REALLY_USED_REGISTERS): Likewise.
            (REG_ALLOC_ORDER): Likewise.
            (DM_REGNO_P): New macro.
            (enum reg_class): Add dense math register support.
            (REG_CLASS_NAMES): Likewise.
            (REGISTER_NAMES): Likewise.
            (ADDITIONAL_REGISTER_NAMES): Likewise.
            * config/rs6000/rs6000.md (FIRST_DM_REGNO): New constant.
            (LAST_DM_REGNO): Likewise.
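
The new rs6000_dense_math_register_move_cost helper listed above is easier to follow as a standalone function.  The sketch below is a simplified model, not the GCC implementation: the boolean and integer parameters are hypothetical stand-ins for TARGET_DENSE_MATH, the register class intersection test, the XOmode check, and hard_regno_nregs.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the dense math register move cost.

   DENSE_MATH      stands in for TARGET_DENSE_MATH.
   OTHER_HAS_VSX   is true if the other register class contains VSX
                   registers.
   IS_XO_MODE      is true for XOmode (__vector_quad).
   NREGS           stands in for hard_regno_nregs (FIRST_DM_REGNO, mode).  */
int
dm_register_move_cost (bool dense_math, bool other_has_vsx,
                       bool is_xo_mode, int nregs)
{
  const int reg_move_base = 2;

  assert (nregs > 0);

  if (dense_math && other_has_vsx)
    {
      /* A 512-bit accumulator moves to/from VSX registers in one
         instruction.  */
      if (is_xo_mode)
        return reg_move_base;

      return reg_move_base * 2 * nregs;
    }

  /* Anything else (for example GPRs) is made prohibitively expensive so
     that the register allocator avoids it.  */
  return 1000 * 2 * nregs;
}
```

With these stand-ins, an XOmode move between a dense math register and a VSX class costs 2, while the same move against a class without VSX registers costs 2000, which steers the allocator toward VSX registers as intermediaries.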

Diff:
---
 gcc/config/rs6000/mma.md        |  37 ++++++-
 gcc/config/rs6000/predicates.md |  26 ++++-
 gcc/config/rs6000/rs6000.cc     | 213 ++++++++++++++++++++++++++++++++--------
 gcc/config/rs6000/rs6000.h      |  37 ++++++-
 gcc/config/rs6000/rs6000.md     |   2 +
 5 files changed, 263 insertions(+), 52 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 77e7c633730c..1813adbecd31 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -313,7 +313,7 @@
    (set_attr "length" "*,*,8")])
 
 
-;; Vector quad support.  XOmode can only live in FPRs.
+;; Vector quad support.
 (define_expand "movxo"
   [(set (match_operand:XO 0 "nonimmediate_operand")
        (match_operand:XO 1 "input_operand"))]
@@ -338,10 +338,13 @@
     gcc_assert (false);
 })
 
-(define_insn_and_split "*movxo"
+;; If we do not have dense math registers, XOmode can only live in FPR
+;; registers (0..31).
+
+(define_insn_and_split "*movxo_nodm"
   [(set (match_operand:XO 0 "nonimmediate_operand" "=d,ZwO,d")
        (match_operand:XO 1 "input_operand" "ZwO,d,d"))]
-  "TARGET_MMA
+  "TARGET_MMA && !TARGET_DENSE_MATH
    && (gpc_reg_operand (operands[0], XOmode)
        || gpc_reg_operand (operands[1], XOmode))"
   "@
@@ -358,6 +361,34 @@
    (set_attr "length" "*,*,16")
    (set_attr "max_prefixed_insns" "2,2,*")])
 
+;; If dense math registers are available, XOmode can live in either VSX
+;; registers (0..63) or dense math registers.
+
+(define_insn_and_split "*movxo_dm"
+  [(set (match_operand:XO 0 "nonimmediate_operand" "=wa,ZwO,wa,wD,wD,wa")
+       (match_operand:XO 1 "input_operand"        "ZwO,wa, wa,wa,wD,wD"))]
+  "TARGET_DENSE_MATH
+   && (gpc_reg_operand (operands[0], XOmode)
+       || gpc_reg_operand (operands[1], XOmode))"
+  "@
+   #
+   #
+   #
+   dmxxinstdmr512 %0,%1,%Y1,0
+   dmmr %0,%1
+   dmxxextfdmr512 %0,%Y0,%1,0"
+  "&& reload_completed
+   && !dense_math_operand (operands[0], XOmode)
+   && !dense_math_operand (operands[1], XOmode)"
+  [(const_int 0)]
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "type" "vecload,vecstore,veclogical,mma,mma,mma")
+   (set_attr "length" "*,*,16,*,*,*")
+   (set_attr "max_prefixed_insns" "2,2,*,*,*,*")])
+
 (define_expand "vsx_assemble_pair"
   [(match_operand:OO 0 "vsx_register_operand")
    (match_operand:V16QI 1 "mma_assemble_input_operand")
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 682fd2dc6e85..5de81d54507b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -186,8 +186,26 @@
   return VLOGICAL_REGNO_P (REGNO (op));
 })
 
-;; Return 1 if op is an accumulator.  On power10 systems, the accumulators
-;; overlap with the FPRs.
+;; Return 1 if op is a dense math register
+(define_predicate "dense_math_operand"
+  (match_operand 0 "register_operand")
+{
+  if (!REG_P (op))
+    return 0;
+
+  if (!HARD_REGISTER_P (op))
+    return 1;
+
+  return DM_REGNO_P (REGNO (op));
+})
+
+;; Return 1 if op is an accumulator.
+;;
+;; On power10 and power11 systems, the accumulators overlap with the
+;; FPRs and the register must be divisible by 4.
+;;
+;; On systems with dense math registers, the accumulators are separate
+;; registers and do not overlap with the FPR registers.
 (define_predicate "accumulator_operand"
   (match_operand 0 "register_operand")
 {
@@ -201,7 +219,9 @@
     return 1;
 
   int r = REGNO (op);
-  return FP_REGNO_P (r) && (r & 3) == 0;
+  return (TARGET_DENSE_MATH
+         ? DM_REGNO_P (r)
+         : FP_REGNO_P (r) && (r & 3) == 0);
 })
 
 ;; Return 1 if op is the carry register.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 68d5e95179f7..2587c00301f7 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -292,7 +292,8 @@ enum rs6000_reg_type {
   ALTIVEC_REG_TYPE,
   FPR_REG_TYPE,
   SPR_REG_TYPE,
-  CR_REG_TYPE
+  CR_REG_TYPE,
+  DM_REG_TYPE
 };
 
 /* Map register class to register type.  */
@@ -306,22 +307,24 @@ static enum rs6000_reg_type reg_class_to_reg_type[N_REG_CLASSES];
 
 
 /* Register classes we care about in secondary reload or go if legitimate
-   address.  We only need to worry about GPR, FPR, and Altivec registers here,
-   along an ANY field that is the OR of the 3 register classes.  */
+   address.  We only need to worry about GPR, FPR, Altivec, and dense math
+   registers here, along an ANY field that is the OR of the 4 register
+   classes.  */
 
 enum rs6000_reload_reg_type {
   RELOAD_REG_GPR,                      /* General purpose registers.  */
   RELOAD_REG_FPR,                      /* Traditional floating point regs.  */
   RELOAD_REG_VMX,                      /* Altivec (VMX) registers.  */
-  RELOAD_REG_ANY,                      /* OR of GPR, FPR, Altivec masks.  */
+  RELOAD_REG_DMR,                      /* Dense math registers.  */
+  RELOAD_REG_ANY,                      /* OR of GPR/FPR/VMX/DMR masks.  */
   N_RELOAD_REG
 };
 
-/* For setting up register classes, loop through the 3 register classes mapping
+/* For setting up register classes, loop through the 4 register classes mapping
    into real registers, and skip the ANY class, which is just an OR of the
    bits.  */
 #define FIRST_RELOAD_REG_CLASS RELOAD_REG_GPR
-#define LAST_RELOAD_REG_CLASS  RELOAD_REG_VMX
+#define LAST_RELOAD_REG_CLASS  RELOAD_REG_DMR
 
 /* Map reload register type to a register in the register class.  */
 struct reload_reg_map_type {
@@ -333,6 +336,7 @@ static const struct reload_reg_map_type reload_reg_map[N_RELOAD_REG] = {
   { "Gpr",     FIRST_GPR_REGNO },      /* RELOAD_REG_GPR.  */
   { "Fpr",     FIRST_FPR_REGNO },      /* RELOAD_REG_FPR.  */
   { "VMX",     FIRST_ALTIVEC_REGNO },  /* RELOAD_REG_VMX.  */
+  { "Dmr",     FIRST_DM_REGNO },       /* RELOAD_REG_DMR.  */
   { "Any",     -1 },                   /* RELOAD_REG_ANY.  */
 };
 
@@ -1226,6 +1230,8 @@ char rs6000_reg_names[][8] =
       "0",  "1",  "2",  "3",  "4",  "5",  "6",  "7",
   /* vrsave vscr sfp */
       "vrsave", "vscr", "sfp",
+  /* dense math registers.  */
+      "0", "1", "2", "3", "4", "5", "6", "7",
 };
 
 #ifdef TARGET_REGNAMES
@@ -1252,6 +1258,8 @@ static const char alt_reg_names[][8] =
   "%cr0",  "%cr1", "%cr2", "%cr3", "%cr4", "%cr5", "%cr6", "%cr7",
   /* vrsave vscr sfp */
   "vrsave", "vscr", "sfp",
+  /* dense math registers.  */
+  "%dmr0", "%dmr1", "%dmr2", "%dmr3", "%dmr4", "%dmr5", "%dmr6", "%dmr7",
 };
 #endif
 
@@ -1842,6 +1850,9 @@ rs6000_hard_regno_nregs_internal (int regno, machine_mode mode)
   else if (ALTIVEC_REGNO_P (regno))
     reg_size = UNITS_PER_ALTIVEC_WORD;
 
+  else if (DM_REGNO_P (regno))
+    reg_size = UNITS_PER_DM_WORD;
+
   else
     reg_size = UNITS_PER_WORD;
 
@@ -1863,9 +1874,32 @@ rs6000_hard_regno_mode_ok_uncached (int regno, machine_mode mode)
   if (mode == OOmode)
     return (TARGET_MMA && VSX_REGNO_P (regno) && (regno & 1) == 0);
 
-  /* MMA accumulator modes need FPR registers divisible by 4.  */
+  /* On ISA 3.1 (power10), MMA accumulator modes need FPR registers divisible
+     by 4.
+
+     If dense math registers are enabled, we can allow all VSX registers plus
+     the dense math registers.  VSX registers are used to load and store the
+     registers as the accumulator registers do not have load and store
+     instructions.  Because we just use the VSX registers for load/store
+     operations, we just need to make sure load vector pair and store vector
+     pair instructions can be used.  */
   if (mode == XOmode)
-    return (TARGET_MMA && FP_REGNO_P (regno) && (regno & 3) == 0);
+    {
+      if (!TARGET_DENSE_MATH)
+       return (FP_REGNO_P (regno) && (regno & 3) == 0);
+
+      else if (DM_REGNO_P (regno))
+       return 1;
+
+      else
+       return (VSX_REGNO_P (regno)
+               && VSX_REGNO_P (last_regno)
+               && (regno & 1) == 0);
+    }
+
+  /* No other types other than XOmode can go in dense math registers.  */
+  if (DM_REGNO_P (regno))
+    return 0;
 
   /* PTImode can only go in GPRs.  Quad word memory operations require even/odd
      register combinations, and use PTImode where we need to deal with quad
@@ -2308,6 +2342,7 @@ rs6000_debug_reg_global (void)
   rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO,
                          LAST_ALTIVEC_REGNO,
                          "vs");
+  rs6000_debug_reg_print (FIRST_DM_REGNO, LAST_DM_REGNO, "dense_math");
   rs6000_debug_reg_print (LR_REGNO, LR_REGNO, "lr");
   rs6000_debug_reg_print (CTR_REGNO, CTR_REGNO, "ctr");
   rs6000_debug_reg_print (CR0_REGNO, CR7_REGNO, "cr");
@@ -2634,6 +2669,21 @@ rs6000_setup_reg_addr_masks (void)
          addr_mask = 0;
          reg = reload_reg_map[rc].reg;
 
+         /* Special case dense math registers.  */
+         if (rc == RELOAD_REG_DMR)
+           {
+             if (TARGET_DENSE_MATH && m2 == XOmode)
+               {
+                 addr_mask = RELOAD_REG_VALID;
+                 reg_addr[m].addr_mask[rc] = addr_mask;
+                 any_addr_mask |= addr_mask;
+               }
+             else
+               reg_addr[m].addr_mask[rc] = 0;
+
+             continue;
+           }
+
          /* Can mode values go in the GPR/FPR/Altivec registers?  */
          if (reg >= 0 && rs6000_hard_regno_mode_ok_p[m][reg])
            {
@@ -2784,6 +2834,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   for (r = CR1_REGNO; r <= CR7_REGNO; ++r)
     rs6000_regno_regclass[r] = CR_REGS;
 
+  for (r = FIRST_DM_REGNO; r <= LAST_DM_REGNO; ++r)
+    rs6000_regno_regclass[r] = DM_REGS;
+
   rs6000_regno_regclass[LR_REGNO] = LINK_REGS;
   rs6000_regno_regclass[CTR_REGNO] = CTR_REGS;
   rs6000_regno_regclass[CA_REGNO] = NO_REGS;
@@ -2808,6 +2861,7 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   reg_class_to_reg_type[(int)LINK_OR_CTR_REGS] = SPR_REG_TYPE;
   reg_class_to_reg_type[(int)CR_REGS] = CR_REG_TYPE;
   reg_class_to_reg_type[(int)CR0_REGS] = CR_REG_TYPE;
+  reg_class_to_reg_type[(int)DM_REGS] = DM_REG_TYPE;
 
   if (TARGET_VSX)
     {
@@ -2994,8 +3048,11 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_DIRECT_MOVE_128)
     rs6000_constraints[RS6000_CONSTRAINT_we] = VSX_REGS;
 
+  /* Support for the accumulator registers, either FPR registers (aka original
+     mma) or dense math registers.  */
   if (TARGET_MMA)
-    rs6000_constraints[RS6000_CONSTRAINT_wD] = FLOAT_REGS;
+    rs6000_constraints[RS6000_CONSTRAINT_wD]
+      = TARGET_DENSE_MATH ? DM_REGS : FLOAT_REGS;
 
   /* Set up the reload helper and direct move functions.  */
   if (TARGET_VSX || TARGET_ALTIVEC)
@@ -12365,6 +12422,11 @@ rs6000_secondary_reload_memory (rtx addr,
     addr_mask = (reg_addr[mode].addr_mask[RELOAD_REG_VMX]
                 & ~RELOAD_REG_AND_M16);
 
+  /* Dense math registers use VSX registers for memory operations, and need to
+     generate some extra instructions.  */
+  else if (rclass == DM_REGS)
+    return 2;
+
   /* If the register allocator hasn't made up its mind yet on the register
      class to use, settle on defaults to use.  */
   else if (rclass == NO_REGS)
@@ -12693,6 +12755,13 @@ rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type,
               || (to_type == SPR_REG_TYPE && from_type == GPR_REG_TYPE)))
     return true;
 
+  /* We can transfer between VSX registers and dense math registers without
+     needing extra registers.  */
+  if (TARGET_DENSE_MATH && mode == XOmode
+      && ((to_type == DM_REG_TYPE && from_type == VSX_REG_TYPE)
+         || (to_type == VSX_REG_TYPE && from_type == DM_REG_TYPE)))
+    return true;
+
   return false;
 }
 
@@ -13387,6 +13456,10 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
   machine_mode mode = GET_MODE (x);
   bool is_constant = CONSTANT_P (x);
 
+  /* Dense math registers can't be loaded or stored.  */
+  if (rclass == DM_REGS)
+    return NO_REGS;
+
   /* If a mode can't go in FPR/ALTIVEC/VSX registers, don't return a preferred
      reload class for it.  */
   if ((rclass == ALTIVEC_REGS || rclass == VSX_REGS)
@@ -13483,7 +13556,7 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
        return VSX_REGS;
 
       if (mode == XOmode)
-       return FLOAT_REGS;
+       return TARGET_DENSE_MATH ? VSX_REGS : FLOAT_REGS;
 
       if (GET_MODE_CLASS (mode) == MODE_INT)
        return GENERAL_REGS;
@@ -13608,6 +13681,11 @@ rs6000_secondary_reload_class (enum reg_class rclass, machine_mode mode,
   else
     regno = -1;
 
+  /* Dense math registers don't have loads or stores.  We have to go through
+     the VSX registers to load XOmode (vector quad).  */
+  if (TARGET_DENSE_MATH && rclass == DM_REGS)
+    return VSX_REGS;
+
   /* If we have VSX register moves, prefer moving scalar values between
      Altivec registers and GPR by going via an FPR (and then via memory)
      instead of reloading the secondary memory address for Altivec moves.  */
@@ -14139,8 +14217,14 @@ print_operand (FILE *file, rtx x, int code)
         output_operand.  */
 
     case 'A':
-      /* Write the MMA accumulator number associated with VSX register X.  */
-      if (!REG_P (x) || !FP_REGNO_P (REGNO (x)) || (REGNO (x) % 4) != 0)
+      /* Write the MMA accumulator number associated with VSX register X.  On
+        dense math systems, only allow dense math accumulators, not
+        accumulators overlapping with the FPR registers.  */
+      if (!REG_P (x))
+       output_operand_lossage ("invalid %%A value");
+      else if (TARGET_DENSE_MATH && DM_REGNO_P (REGNO (x)))
+       fprintf (file, "%d", REGNO (x) - FIRST_DM_REGNO);
+      else if (!FP_REGNO_P (REGNO (x)) || (REGNO (x) % 4) != 0)
        output_operand_lossage ("invalid %%A value");
       else
        fprintf (file, "%d", (REGNO (x) - FIRST_FPR_REGNO) / 4);
@@ -22760,6 +22844,31 @@ rs6000_debug_address_cost (rtx x, machine_mode mode,
 }
 
 
+/* Subroutine to determine the move cost of dense math registers.  If we are
+   moving to/from VSX_REGISTER registers, the cost is either 1 move (for
+   512-bit accumulators) or 2 moves (for 1,024 dense math registers).  If we are
+   moving to anything else like GPR registers, make the cost very high.  */
+
+static int
+rs6000_dense_math_register_move_cost (machine_mode mode, reg_class_t rclass)
+{
+  const int reg_move_base = 2;
+  HARD_REG_SET vsx_set = (reg_class_contents[rclass]
+                         & reg_class_contents[VSX_REGS]);
+
+  if (TARGET_DENSE_MATH && !hard_reg_set_empty_p (vsx_set))
+    {
+      /* __vector_quad (i.e. XOmode) is transferred in 1 instruction.  */
+      if (mode == XOmode)
+       return reg_move_base;
+
+      else
+       return reg_move_base * 2 * hard_regno_nregs (FIRST_DM_REGNO, mode);
+    }
+
+  return 1000 * 2 * hard_regno_nregs (FIRST_DM_REGNO, mode);
+}
+
 /* A C expression returning the cost of moving data from a register of class
    CLASS1 to one of CLASS2.  */
 
@@ -22773,17 +22882,28 @@ rs6000_register_move_cost (machine_mode mode,
   if (TARGET_DEBUG_COST)
     dbg_cost_ctrl++;
 
+  HARD_REG_SET to_vsx, from_vsx;
+  to_vsx = reg_class_contents[to] & reg_class_contents[VSX_REGS];
+  from_vsx = reg_class_contents[from] & reg_class_contents[VSX_REGS];
+
+  /* Special case dense math registers, that can only move to/from VSX registers.  */
+  if (from == DM_REGS && to == DM_REGS)
+    ret = 2 * hard_regno_nregs (FIRST_DM_REGNO, mode);
+
+  else if (from == DM_REGS)
+    ret = rs6000_dense_math_register_move_cost (mode, to);
+
+  else if (to == DM_REGS)
+    ret = rs6000_dense_math_register_move_cost (mode, from);
+
   /* If we have VSX, we can easily move between FPR or Altivec registers,
      otherwise we can only easily move within classes.
      Do this first so we give best-case answers for union classes
      containing both gprs and vsx regs.  */
-  HARD_REG_SET to_vsx, from_vsx;
-  to_vsx = reg_class_contents[to] & reg_class_contents[VSX_REGS];
-  from_vsx = reg_class_contents[from] & reg_class_contents[VSX_REGS];
-  if (!hard_reg_set_empty_p (to_vsx)
-      && !hard_reg_set_empty_p (from_vsx)
-      && (TARGET_VSX
-         || hard_reg_set_intersect_p (to_vsx, from_vsx)))
+  else if (!hard_reg_set_empty_p (to_vsx)
+          && !hard_reg_set_empty_p (from_vsx)
+          && (TARGET_VSX
+              || hard_reg_set_intersect_p (to_vsx, from_vsx)))
     {
       int reg = FIRST_FPR_REGNO;
       if (TARGET_VSX
@@ -22879,6 +22999,9 @@ rs6000_memory_move_cost (machine_mode mode, reg_class_t rclass,
     ret = 4 * hard_regno_nregs (32, mode);
   else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS))
     ret = 4 * hard_regno_nregs (FIRST_ALTIVEC_REGNO, mode);
+  else if (reg_classes_intersect_p (rclass, DM_REGS))
+    ret = (rs6000_dense_math_register_move_cost (mode, VSX_REGS)
+          + rs6000_memory_move_cost (mode, VSX_REGS, false));
   else
     ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
 
@@ -24087,6 +24210,8 @@ rs6000_compute_pressure_classes (enum reg_class *pressure_classes)
       if (TARGET_HARD_FLOAT)
        pressure_classes[n++] = FLOAT_REGS;
     }
+  if (TARGET_DENSE_MATH)
+    pressure_classes[n++] = DM_REGS;
   pressure_classes[n++] = CR_REGS;
   pressure_classes[n++] = SPECIAL_REGS;
 
@@ -24251,6 +24376,10 @@ rs6000_debugger_regno (unsigned int regno, unsigned int format)
     return 67;
   if (regno == 64)
     return 64;
+  /* XXX: This is a guess.  The GCC register number for FIRST_DM_REGNO is 111,
+     but the frame pointer regnum uses that.  */
+  if (DM_REGNO_P (regno))
+    return regno - FIRST_DM_REGNO + 112;
 
   gcc_unreachable ();
 }
@@ -27490,9 +27619,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
          unsigned offset = 0;
          unsigned size = GET_MODE_SIZE (reg_mode);
 
-         /* If we are reading an accumulator register, we have to
-            deprime it before we can access it.  */
-         if (TARGET_MMA
+         /* If we are reading an accumulator register, we have to deprime it
+            before we can access it unless we have dense math registers.  */
+         if (TARGET_MMA && !TARGET_DENSE_MATH
              && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
            emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27524,9 +27653,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
              emit_insn (gen_rtx_SET (dst2, src2));
            }
 
-         /* If we are writing an accumulator register, we have to
-            prime it after we've written it.  */
-         if (TARGET_MMA
+         /* If we are writing an accumulator register, we have to prime it
+            after we've written it unless we have dense math registers.  */
+         if (TARGET_MMA && !TARGET_DENSE_MATH
              && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
            emit_insn (gen_mma_xxmtacc (dst, dst));
 
@@ -27540,7 +27669,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
                      || XINT (src, 1) == UNSPECV_MMA_ASSEMBLE);
          gcc_assert (REG_P (dst));
          if (GET_MODE (src) == XOmode)
-           gcc_assert (FP_REGNO_P (REGNO (dst)));
+           gcc_assert ((TARGET_DENSE_MATH
+                        ? VSX_REGNO_P (REGNO (dst))
+                        : FP_REGNO_P (REGNO (dst))));
          if (GET_MODE (src) == OOmode)
            gcc_assert (VSX_REGNO_P (REGNO (dst)));
 
@@ -27593,9 +27724,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
              emit_insn (gen_rtx_SET (dst_i, op));
            }
 
-         /* We are writing an accumulator register, so we have to
-            prime it after we've written it.  */
-         if (GET_MODE (src) == XOmode)
+         /* We are writing an accumulator register, so we have to prime it
+            after we've written it unless we have dense math registers.  */
+         if (GET_MODE (src) == XOmode && !TARGET_DENSE_MATH)
            emit_insn (gen_mma_xxmtacc (dst, dst));
 
          return;
@@ -27606,9 +27737,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
   if (REG_P (src) && REG_P (dst) && (REGNO (src) < REGNO (dst)))
     {
-      /* If we are reading an accumulator register, we have to
-        deprime it before we can access it.  */
-      if (TARGET_MMA
+      /* If we are reading an accumulator register, we have to deprime it
+        before we can access it unless we have dense math registers.  */
+      if (TARGET_MMA && !TARGET_DENSE_MATH
          && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
        emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27634,9 +27765,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
                                                         i * reg_mode_size)));
        }
 
-      /* If we are writing an accumulator register, we have to
-        prime it after we've written it.  */
-      if (TARGET_MMA
+      /* If we are writing an accumulator register, we have to prime it after
+        we've written it unless we have dense math registers.  */
+      if (TARGET_MMA && !TARGET_DENSE_MATH
          && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
        emit_insn (gen_mma_xxmtacc (dst, dst));
     }
@@ -27771,9 +27902,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
            gcc_assert (rs6000_offsettable_memref_p (dst, reg_mode, true));
        }
 
-      /* If we are reading an accumulator register, we have to
-        deprime it before we can access it.  */
-      if (TARGET_MMA && REG_P (src)
+      /* If we are reading an accumulator register, we have to deprime it
+        before we can access it unless we have dense math registers.  */
+      if (TARGET_MMA && !TARGET_DENSE_MATH && REG_P (src)
          && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
        emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27803,9 +27934,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
                                                         j * reg_mode_size)));
        }
 
-      /* If we are writing an accumulator register, we have to
-        prime it after we've written it.  */
-      if (TARGET_MMA && REG_P (dst)
+      /* If we are writing an accumulator register, we have to prime it after
+        we've written it unless we have dense math registers.  */
+      if (TARGET_MMA && !TARGET_DENSE_MATH && REG_P (dst)
          && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
        emit_insn (gen_mma_xxmtacc (dst, dst));
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 04709f0dcd6e..5214a7c22cea 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -653,6 +653,7 @@ extern unsigned char rs6000_recip_bits[];
 #define UNITS_PER_FP_WORD 8
 #define UNITS_PER_ALTIVEC_WORD 16
 #define UNITS_PER_VSX_WORD 16
+#define UNITS_PER_DM_WORD 128
 
 /* Type used for ptrdiff_t, as a string used in a declaration.  */
 #define PTRDIFF_TYPE "int"
@@ -766,7 +767,7 @@ enum data_align { align_abi, align_opt, align_both };
    Another pseudo (not included in DWARF_FRAME_REGISTERS) is soft frame
    pointer, which is eventually eliminated in favor of SP or FP.  */
 
-#define FIRST_PSEUDO_REGISTER 111
+#define FIRST_PSEUDO_REGISTER 119
 
 /* Use standard DWARF numbering for DWARF debugging information.  */
 #define DEBUGGER_REGNO(REGNO) rs6000_debugger_regno ((REGNO), 0)
@@ -803,7 +804,9 @@ enum data_align { align_abi, align_opt, align_both };
    /* cr0..cr7 */                                 \
    0, 0, 0, 0, 0, 0, 0, 0,                        \
    /* vrsave vscr sfp */                          \
-   1, 1, 1                                        \
+   1, 1, 1,                                       \
+   /* Dense math registers.  */                           \
+   0, 0, 0, 0, 0, 0, 0, 0                         \
 }
 
 /* Like `CALL_USED_REGISTERS' except this macro doesn't require that
@@ -827,7 +830,9 @@ enum data_align { align_abi, align_opt, align_both };
    /* cr0..cr7 */                                 \
    1, 1, 0, 0, 0, 1, 1, 1,                        \
    /* vrsave vscr sfp */                          \
-   0, 0, 0                                        \
+   0, 0, 0,                                       \
+   /* Dense math registers.  */                           \
+   0, 0, 0, 0, 0, 0, 0, 0                         \
 }
 
 #define TOTAL_ALTIVEC_REGS     (LAST_ALTIVEC_REGNO - FIRST_ALTIVEC_REGNO + 1)
@@ -864,6 +869,7 @@ enum data_align { align_abi, align_opt, align_both };
        v2              (not saved; incoming vector arg reg; return value)
        v19 - v14       (not saved or used for anything)
        v31 - v20       (saved; order given to save least number)
+       dmr0 - dmr7     (not saved)
        vrsave, vscr    (fixed)
        sfp             (fixed)
 */
@@ -906,6 +912,9 @@ enum data_align { align_abi, align_opt, align_both };
    66,                                                         \
    83, 82, 81, 80, 79, 78,                                     \
    95, 94, 93, 92, 91, 90, 89, 88, 87, 86, 85, 84,             \
+   /* Dense math registers.  */                                        \
+   111, 112, 113, 114, 115, 116, 117, 118,                     \
+   /* Vrsave, vscr, sfp.  */                                   \
    108, 109,                                                   \
    110                                                         \
 }
@@ -932,6 +941,9 @@ enum data_align { align_abi, align_opt, align_both };
 /* True if register is a VSX register.  */
 #define VSX_REGNO_P(N) (FP_REGNO_P (N) || ALTIVEC_REGNO_P (N))
 
+/* True if register is a Dense math register.  */
+#define DM_REGNO_P(N)  ((N) >= FIRST_DM_REGNO && (N) <= LAST_DM_REGNO)
+
 /* Alternate name for any vector register supporting floating point, no matter
    which instruction set(s) are available.  */
 #define VFLOAT_REGNO_P(N) \
@@ -1069,6 +1081,7 @@ enum reg_class
   FLOAT_REGS,
   ALTIVEC_REGS,
   VSX_REGS,
+  DM_REGS,
   VRSAVE_REGS,
   VSCR_REGS,
   GEN_OR_FLOAT_REGS,
@@ -1098,6 +1111,7 @@ enum reg_class
   "FLOAT_REGS",                                                        \
   "ALTIVEC_REGS",                                                      \
   "VSX_REGS",                                                          \
+  "DM_REGS",                                                           \
   "VRSAVE_REGS",                                                       \
   "VSCR_REGS",                                                         \
   "GEN_OR_FLOAT_REGS",                                                 \
@@ -1132,6 +1146,8 @@ enum reg_class
   { 0x00000000, 0x00000000, 0xffffffff, 0x00000000 },                  \
   /* VSX_REGS.  */                                                     \
   { 0x00000000, 0xffffffff, 0xffffffff, 0x00000000 },                  \
+  /* DM_REGS.  */                                                      \
+  { 0x00000000, 0x00000000, 0x00000000, 0x007f8000 },                  \
   /* VRSAVE_REGS.  */                                                  \
   { 0x00000000, 0x00000000, 0x00000000, 0x00001000 },                  \
   /* VSCR_REGS.  */                                                    \
@@ -1159,7 +1175,7 @@ enum reg_class
   /* CA_REGS.  */                                                      \
   { 0x00000000, 0x00000000, 0x00000000, 0x00000004 },                  \
   /* ALL_REGS.  */                                                     \
-  { 0xffffffff, 0xffffffff, 0xffffffff, 0x00007fff }                   \
+  { 0xffffffff, 0xffffffff, 0xffffffff, 0x007fffff }                   \
 }
 
 /* The same information, inverted:
@@ -2060,7 +2076,16 @@ extern char rs6000_reg_names[][8];       /* register names (0 vs. %r0).  */
   &rs6000_reg_names[108][0],   /* vrsave  */                           \
   &rs6000_reg_names[109][0],   /* vscr  */                             \
                                                                        \
-  &rs6000_reg_names[110][0]    /* sfp  */                              \
+  &rs6000_reg_names[110][0],   /* sfp  */                              \
+                                                                       \
+  &rs6000_reg_names[111][0],   /* dmr0  */                             \
+  &rs6000_reg_names[112][0],   /* dmr1  */                             \
+  &rs6000_reg_names[113][0],   /* dmr2  */                             \
+  &rs6000_reg_names[114][0],   /* dmr3  */                             \
+  &rs6000_reg_names[115][0],   /* dmr4  */                             \
+  &rs6000_reg_names[116][0],   /* dmr5  */                             \
+  &rs6000_reg_names[117][0],   /* dmr6  */                             \
+  &rs6000_reg_names[118][0],   /* dmr7  */                             \
 }
 
 /* Table of additional register names to use in user input.  */
@@ -2114,6 +2139,8 @@ extern char rs6000_reg_names[][8];        /* register names (0 vs. %r0).  */
   {"vs52", 84}, {"vs53", 85}, {"vs54", 86}, {"vs55", 87},      \
   {"vs56", 88}, {"vs57", 89}, {"vs58", 90}, {"vs59", 91},      \
   {"vs60", 92}, {"vs61", 93}, {"vs62", 94}, {"vs63", 95},      \
+  {"dmr0", 111}, {"dmr1", 112}, {"dmr2", 113}, {"dmr3", 114},  \
+  {"dmr4", 115}, {"dmr5", 116}, {"dmr6", 117}, {"dmr7", 118},  \
 }
 
 /* This is how to output an element of a case-vector that is relative.  */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3089551552c8..57a239791ee3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -51,6 +51,8 @@
    (VRSAVE_REGNO               108)
    (VSCR_REGNO                 109)
    (FRAME_POINTER_REGNUM       110)
+   (FIRST_DM_REGNO             111)
+   (LAST_DM_REGNO              118)
   ])
 
 ;;
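
The DM_REGS and ALL_REGS bit masks in the rs6000.h hunk can be cross-checked against the register numbers 111-118 with a short standalone C sketch.  It assumes each initializer word covers 32 consecutive hard registers, lowest bit first; the helper names are hypothetical and are not part of GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DM_REGS and ALL_REGS entries as they appear in the patch.  */
const uint32_t dm_regs_mask[4]
  = { 0x00000000, 0x00000000, 0x00000000, 0x007f8000 };
const uint32_t all_regs_mask[4]
  = { 0xffffffff, 0xffffffff, 0xffffffff, 0x007fffff };

/* Hypothetical helper: is REGNO's bit set in MASK?  Assumes 32 registers
   per word, least significant bit first.  */
bool
reg_in_class (const uint32_t mask[4], int regno)
{
  assert (regno >= 0 && regno < 128);
  return ((mask[regno / 32] >> (regno % 32)) & 1u) != 0;
}

/* Hypothetical helper: count the registers whose bits are set in MASK.  */
int
class_size (const uint32_t mask[4])
{
  int count = 0;

  for (int regno = 0; regno < 128; regno++)
    if (reg_in_class (mask, regno))
      count++;

  return count;
}
```

Under that assumption, DM_REGS holds exactly registers 111 through 118 (8 registers), and ALL_REGS holds 119 registers, which agrees with the new FIRST_PSEUDO_REGISTER value.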
