PR 67211 is an 'insn does not satisfy its constraints' error that shows up on
the GCC 5.x branch when the test case is compiled with -mcpu=power7
-mtune=power8 -O3. Looking at the code, I noticed that the compiler combined
adjacent 64-bit integer/pointer fields in a structure from DImode into
V2DImode. It kept these values in the vector registers, and then tried to move
a field that is used later back to a GPR. If the cpu were power8, it could use
the direct move instructions, but on power7 those instructions don't exist.
The current trunk compiler has dialed back the optimization, and it no longer
tries to convert adjacent fields to V2DImode in this particular case, but it is
still an issue on the GCC 5 branch.
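
To make the pattern concrete, here is a minimal C sketch of the kind of code
involved. The structure and function names are hypothetical and are not taken
from the PR test case. Whether a given compiler actually combines the two
fields into a single V2DImode access depends on the options and the optimizer,
so this only shows the shape of the source, not a guaranteed reproducer.

```c
#include <assert.h>

/* Hypothetical structure with two adjacent pointer fields.  On an LP64
   target each field is 64 bits wide (DImode), so the pair spans 128 bits.  */
struct range {
  int *first;
  int *last;
};

/* Copying both fields together is the kind of access an optimizer may
   decide to do as one 128-bit vector load and store.  */
static void copy_range(struct range *dst, const struct range *src) {
  dst->first = src->first;
  dst->last = src->last;
}

/* Using one of the fields as a scalar afterwards needs the value in a
   general purpose register, not in a vector register.  */
static long range_length(const struct range *r) {
  assert(r->last >= r->first);
  return r->last - r->first;
}
```

If copy_range is done in the vector unit and range_length then needs one of
the fields as a scalar, the value has to get from a vector register to a GPR,
which is where direct move matters.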

While debugging the issue, I noticed that the -mefficient-unaligned-vsx option
was being set whenever -mtune=power8 was used, even if the architecture was not
power8. Efficient unaligned VSX is an architecture feature, not a tuning
feature. With the option changed to be an architecture feature, the compiler no
longer tries to do the V2DImode optimization on power7, because power7 doesn't
have fast unaligned support.
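
The difference between the two ways of setting the option can be modeled with
a small, self-contained C sketch. Everything in it (the enum, the mask values,
and the function names) is a simplified stand-in invented for illustration and
is not the real rs6000 code. It assumes only what is described above: the old
logic keyed off the tuning cpu, and the new logic keys off the ISA flags
implied by the architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the real cpu and option masks.  */
enum cpu { CPU_POWER7, CPU_POWER8 };

#define MASK_VSX                     (1u << 0)
#define MASK_DIRECT_MOVE             (1u << 1)
#define MASK_EFFICIENT_UNALIGNED_VSX (1u << 2)

/* ISA flags implied by the architecture (-mcpu=) in this toy model.
   Both cpus have VSX; only power8 gets the ISA 2.07 features.  */
static unsigned arch_isa_flags(enum cpu arch) {
  unsigned flags = MASK_VSX;
  if (arch == CPU_POWER8)
    flags |= MASK_DIRECT_MOVE | MASK_EFFICIENT_UNALIGNED_VSX;
  return flags;
}

/* Old behavior: enabled whenever VSX is available and the tuning cpu is
   power8, regardless of the architecture.  */
static bool efficient_unaligned_old(enum cpu arch, enum cpu tune) {
  return (arch_isa_flags(arch) & MASK_VSX) != 0 && tune == CPU_POWER8;
}

/* New behavior: enabled only if the architecture's ISA flags include it.
   The tuning cpu no longer matters.  */
static bool efficient_unaligned_new(enum cpu arch, enum cpu tune) {
  (void) tune;
  return (arch_isa_flags(arch) & MASK_EFFICIENT_UNALIGNED_VSX) != 0;
}
```

In this model, -mcpu=power7 -mtune=power8 corresponds to
efficient_unaligned_old (CPU_POWER7, CPU_POWER8), which is true, while
efficient_unaligned_new with the same arguments is false. For power8 as the
architecture, the new function is true whatever the tuning cpu is.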

I have checked this on a big-endian power7 system and a little-endian power8
system, using both the GCC 5.x patches and the patches for the trunk.  There
were no regressions in any of the runs.  Is it ok to install these patches on
both the GCC 5.x branch and trunk?

I would like to commit a similar patch for the 4.9 branch as well. Is this ok?

Note, due to rs6000.opt being slightly different between GCC 5.x and trunk,
there are two different patches, one for GCC 5.x and the other for GCC 6.x
(trunk).

[gcc]
2015-08-20  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        PR target/67211
        * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Set
        -mefficient-unaligned-vsx on ISA 2.7.

        * config/rs6000/rs6000.opt (-mefficient-unaligned-vsx): Convert
        option to a masked option.

        * config/rs6000/rs6000.c (rs6000_option_override_internal): Rework
        logic for -mefficient-unaligned-vsx so that it is set via an arch
        ISA option, instead of being set if -mtune=power8 is set. Move
        -mefficient-unaligned-vsx and -mallow-movmisalign handling to be
        near other default option handling.

[gcc/testsuite]
2015-08-20  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        PR target/67211
        * g++.dg/pr67211.C: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def   (revision 226986)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -53,6 +53,7 @@
                                 | OPTION_MASK_P8_VECTOR                \
                                 | OPTION_MASK_CRYPTO                   \
                                 | OPTION_MASK_DIRECT_MOVE              \
+                                | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
                                 | OPTION_MASK_HTM                      \
                                 | OPTION_MASK_QUAD_MEMORY              \
                                 | OPTION_MASK_QUAD_MEMORY_ATOMIC       \
@@ -78,6 +79,7 @@
                                 | OPTION_MASK_DFP                      \
                                 | OPTION_MASK_DIRECT_MOVE              \
                                 | OPTION_MASK_DLMZB                    \
+                                | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
                                 | OPTION_MASK_FPRND                    \
                                 | OPTION_MASK_HTM                      \
                                 | OPTION_MASK_ISEL                     \
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt        (revision 226986)
+++ gcc/config/rs6000/rs6000.opt        (working copy)
@@ -212,7 +212,7 @@ Target Undocumented Var(TARGET_ALLOW_MOV
 ; Allow/disallow the movmisalign in DF/DI vectors
 
 mefficient-unaligned-vector
-Target Undocumented Report Var(TARGET_EFFICIENT_UNALIGNED_VSX) Init(-1) Save
+Target Undocumented Report Mask(EFFICIENT_UNALIGNED_VSX) Var(rs6000_isa_flags)
 ; Consider unaligned VSX accesses to be efficient/inefficient
 
 mallow-df-permute
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 226986)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3692,6 +3692,45 @@ rs6000_option_override_internal (bool gl
       && optimize >= 3)
     rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN;
 
+  /* Set -mallow-movmisalign to explicitly on if we have full ISA 2.07
+     support. If we only have ISA 2.06 support, and the user did not specify
+     the switch, leave it set to -1 so the movmisalign patterns are enabled,
+     but we don't enable the full vectorization support  */
+  if (TARGET_ALLOW_MOVMISALIGN == -1 && TARGET_P8_VECTOR && TARGET_DIRECT_MOVE)
+    TARGET_ALLOW_MOVMISALIGN = 1;
+
+  else if (TARGET_ALLOW_MOVMISALIGN && !TARGET_VSX)
+    {
+      if (TARGET_ALLOW_MOVMISALIGN > 0)
+       error ("-mallow-movmisalign requires -mvsx");
+
+      TARGET_ALLOW_MOVMISALIGN = 0;
+    }
+
+  /* Determine when unaligned vector accesses are permitted, and when
+     they are preferred over masked Altivec loads.  Note that if
+     TARGET_ALLOW_MOVMISALIGN has been disabled by the user, then
+     TARGET_EFFICIENT_UNALIGNED_VSX must be as well.  The converse is
+     not true.  */
+  if (TARGET_EFFICIENT_UNALIGNED_VSX)
+    {
+      if (!TARGET_VSX)
+       {
+         if (rs6000_isa_flags_explicit & OPTION_MASK_EFFICIENT_UNALIGNED_VSX)
+           error ("-mefficient-unaligned-vsx requires -mvsx");
+
+         rs6000_isa_flags &= ~OPTION_MASK_EFFICIENT_UNALIGNED_VSX;
+       }
+
+      else if (!TARGET_ALLOW_MOVMISALIGN)
+       {
+         if (rs6000_isa_flags_explicit & OPTION_MASK_EFFICIENT_UNALIGNED_VSX)
+           error ("-mefficient-unaligned-vsx requires -mallow-movmisalign");
+
+         rs6000_isa_flags &= ~OPTION_MASK_EFFICIENT_UNALIGNED_VSX;
+       }
+    }
+
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
 
@@ -4251,22 +4290,6 @@ rs6000_option_override_internal (bool gl
        }
     }
 
-  /* Determine when unaligned vector accesses are permitted, and when
-     they are preferred over masked Altivec loads.  Note that if
-     TARGET_ALLOW_MOVMISALIGN has been disabled by the user, then
-     TARGET_EFFICIENT_UNALIGNED_VSX must be as well.  The converse is
-     not true.  */
-  if (TARGET_EFFICIENT_UNALIGNED_VSX == -1) {
-    if (TARGET_VSX && rs6000_cpu == PROCESSOR_POWER8
-       && TARGET_ALLOW_MOVMISALIGN != 0)
-      TARGET_EFFICIENT_UNALIGNED_VSX = 1;
-    else
-      TARGET_EFFICIENT_UNALIGNED_VSX = 0;
-  }
-
-  if (TARGET_ALLOW_MOVMISALIGN == -1 && rs6000_cpu == PROCESSOR_POWER8)
-    TARGET_ALLOW_MOVMISALIGN = 1;
-
   /* Set the builtin mask of the various options used that could affect which
      builtins were used.  In the past we used target_flags, but we've run out
      of bits, and some options like SPE and PAIRED are no longer in
@@ -32280,6 +32303,8 @@ static struct rs6000_opt_mask const rs60
   { "crypto",                  OPTION_MASK_CRYPTO,             false, true  },
   { "direct-move",             OPTION_MASK_DIRECT_MOVE,        false, true  },
   { "dlmzb",                   OPTION_MASK_DLMZB,              false, true  },
+  { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
+                                                               false, true  },
   { "fprnd",                   OPTION_MASK_FPRND,              false, true  },
   { "hard-dfp",                OPTION_MASK_DFP,                false, true  },
   { "htm",                     OPTION_MASK_HTM,                false, true  },
Index: gcc/testsuite/g++.dg/pr67211.C
===================================================================
--- gcc/testsuite/g++.dg/pr67211.C      (revision 0)
+++ gcc/testsuite/g++.dg/pr67211.C      (revision 0)
@@ -0,0 +1,50 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-options "-mcpu=power7 -mtune=power8 -O3 -w" } */
+
+/* target/67211, compiler got a 'insn does not satisfy its constraints' error.  */
+
+template <typename _InputIterator, typename _ForwardIterator>
+void find_first_of(_InputIterator, _InputIterator, _ForwardIterator p3,
+                   _ForwardIterator p4) {
+  for (; p3 != p4; ++p3)
+    ;
+}
+
+template <typename, typename, typename> struct A {
+  int _S_buffer_size;
+  int *_M_cur;
+  int *_M_first;
+  int *_M_last;
+  int **_M_node;
+  void operator++() {
+    if (_M_cur == _M_last)
+      m_fn1(_M_node + 1);
+  }
+  void m_fn1(int **p1) {
+    _M_node = p1;
+    _M_first = *p1;
+    _M_last = _M_first + _S_buffer_size;
+  }
+};
+
+template <typename _Tp, typename _Ref, typename _Ptr>
+bool operator==(A<_Tp, _Ref, _Ptr>, A<_Tp, _Ref, _Ptr>);
+template <typename _Tp, typename _Ref, typename _Ptr>
+bool operator!=(A<_Tp, _Ref, _Ptr> p1, A<_Tp, _Ref, _Ptr> p2) {
+  return p1 == p2;
+}
+
+class B {
+public:
+  A<int, int, int> m_fn2();
+};
+struct {
+  B j;
+} a;
+void Linked() {
+  A<int, int, int> b, c, d;
+  find_first_of(d, c, b, a.j.m_fn2());
+}
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def   (revision 227016)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -53,6 +53,7 @@
                                 | OPTION_MASK_P8_VECTOR                \
                                 | OPTION_MASK_CRYPTO                   \
                                 | OPTION_MASK_DIRECT_MOVE              \
+                                | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
                                 | OPTION_MASK_HTM                      \
                                 | OPTION_MASK_QUAD_MEMORY              \
                                 | OPTION_MASK_QUAD_MEMORY_ATOMIC       \
@@ -78,6 +79,7 @@
                                 | OPTION_MASK_DFP                      \
                                 | OPTION_MASK_DIRECT_MOVE              \
                                 | OPTION_MASK_DLMZB                    \
+                                | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
                                 | OPTION_MASK_FPRND                    \
                                 | OPTION_MASK_HTM                      \
                                 | OPTION_MASK_ISEL                     \
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt        (revision 227016)
+++ gcc/config/rs6000/rs6000.opt        (working copy)
@@ -212,7 +212,7 @@
 ; Allow/disallow the movmisalign in DF/DI vectors
 
 mefficient-unaligned-vector
-Target Undocumented Report Var(TARGET_EFFICIENT_UNALIGNED_VSX) Init(-1)
+Target Undocumented Report Mask(EFFICIENT_UNALIGNED_VSX) Var(rs6000_isa_flags)
 ; Consider unaligned VSX accesses to be efficient/inefficient
 
 mallow-df-permute
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 227016)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3716,6 +3716,45 @@
   else if (TARGET_FLOAT128 == FLOAT128_SW && !TARGET_VSX)
     error ("-mfloat128-software requires VSX support");
 
+  /* Set -mallow-movmisalign to explicitly on if we have full ISA 2.07
+     support. If we only have ISA 2.06 support, and the user did not specify
+     the switch, leave it set to -1 so the movmisalign patterns are enabled,
+     but we don't enable the full vectorization support  */
+  if (TARGET_ALLOW_MOVMISALIGN == -1 && TARGET_P8_VECTOR && TARGET_DIRECT_MOVE)
+    TARGET_ALLOW_MOVMISALIGN = 1;
+
+  else if (TARGET_ALLOW_MOVMISALIGN && !TARGET_VSX)
+    {
+      if (TARGET_ALLOW_MOVMISALIGN > 0)
+       error ("-mallow-movmisalign requires -mvsx");
+
+      TARGET_ALLOW_MOVMISALIGN = 0;
+    }
+
+  /* Determine when unaligned vector accesses are permitted, and when
+     they are preferred over masked Altivec loads.  Note that if
+     TARGET_ALLOW_MOVMISALIGN has been disabled by the user, then
+     TARGET_EFFICIENT_UNALIGNED_VSX must be as well.  The converse is
+     not true.  */
+  if (TARGET_EFFICIENT_UNALIGNED_VSX)
+    {
+      if (!TARGET_VSX)
+       {
+         if (rs6000_isa_flags_explicit & OPTION_MASK_EFFICIENT_UNALIGNED_VSX)
+           error ("-mefficient-unaligned-vsx requires -mvsx");
+
+         rs6000_isa_flags &= ~OPTION_MASK_EFFICIENT_UNALIGNED_VSX;
+       }
+
+      else if (!TARGET_ALLOW_MOVMISALIGN)
+       {
+         if (rs6000_isa_flags_explicit & OPTION_MASK_EFFICIENT_UNALIGNED_VSX)
+           error ("-mefficient-unaligned-vsx requires -mallow-movmisalign");
+
+         rs6000_isa_flags &= ~OPTION_MASK_EFFICIENT_UNALIGNED_VSX;
+       }
+    }
+
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
 
@@ -4275,22 +4314,6 @@
        }
     }
 
-  /* Determine when unaligned vector accesses are permitted, and when
-     they are preferred over masked Altivec loads.  Note that if
-     TARGET_ALLOW_MOVMISALIGN has been disabled by the user, then
-     TARGET_EFFICIENT_UNALIGNED_VSX must be as well.  The converse is
-     not true.  */
-  if (TARGET_EFFICIENT_UNALIGNED_VSX == -1) {
-    if (TARGET_VSX && rs6000_cpu == PROCESSOR_POWER8
-       && TARGET_ALLOW_MOVMISALIGN != 0)
-      TARGET_EFFICIENT_UNALIGNED_VSX = 1;
-    else
-      TARGET_EFFICIENT_UNALIGNED_VSX = 0;
-  }
-
-  if (TARGET_ALLOW_MOVMISALIGN == -1 && rs6000_cpu == PROCESSOR_POWER8)
-    TARGET_ALLOW_MOVMISALIGN = 1;
-
   /* Set the builtin mask of the various options used that could affect which
      builtins were used.  In the past we used target_flags, but we've run out
      of bits, and some options like SPE and PAIRED are no longer in
@@ -32921,6 +32944,8 @@
   { "crypto",                  OPTION_MASK_CRYPTO,             false, true  },
   { "direct-move",             OPTION_MASK_DIRECT_MOVE,        false, true  },
   { "dlmzb",                   OPTION_MASK_DLMZB,              false, true  },
+  { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
+                                                               false, true  },
   { "fprnd",                   OPTION_MASK_FPRND,              false, true  },
   { "hard-dfp",                OPTION_MASK_DFP,                false, true  },
   { "htm",                     OPTION_MASK_HTM,                false, true  },
Index: gcc/testsuite/g++.dg/pr67211.C
===================================================================
--- gcc/testsuite/g++.dg/pr67211.C      (revision 0)
+++ gcc/testsuite/g++.dg/pr67211.C      (revision 0)
@@ -0,0 +1,50 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-options "-mcpu=power7 -mtune=power8 -O3 -w" } */
+
+/* target/67211, compiler got a 'insn does not satisfy its constraints' error.  */
+
+template <typename _InputIterator, typename _ForwardIterator>
+void find_first_of(_InputIterator, _InputIterator, _ForwardIterator p3,
+                   _ForwardIterator p4) {
+  for (; p3 != p4; ++p3)
+    ;
+}
+
+template <typename, typename, typename> struct A {
+  int _S_buffer_size;
+  int *_M_cur;
+  int *_M_first;
+  int *_M_last;
+  int **_M_node;
+  void operator++() {
+    if (_M_cur == _M_last)
+      m_fn1(_M_node + 1);
+  }
+  void m_fn1(int **p1) {
+    _M_node = p1;
+    _M_first = *p1;
+    _M_last = _M_first + _S_buffer_size;
+  }
+};
+
+template <typename _Tp, typename _Ref, typename _Ptr>
+bool operator==(A<_Tp, _Ref, _Ptr>, A<_Tp, _Ref, _Ptr>);
+template <typename _Tp, typename _Ref, typename _Ptr>
+bool operator!=(A<_Tp, _Ref, _Ptr> p1, A<_Tp, _Ref, _Ptr> p2) {
+  return p1 == p2;
+}
+
+class B {
+public:
+  A<int, int, int> m_fn2();
+};
+struct {
+  B j;
+} a;
+void Linked() {
+  A<int, int, int> b, c, d;
+  find_first_of(d, c, b, a.j.m_fn2());
+}
