https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80265

--- Comment #32 from Pedro Alves <palves at redhat dot com> ---
In that case, I think the "something something" you're looking for can be 
char_trait's __constant_char_array_p:

+             if (__constant_char_array_p(__first1, __len)
+                 && __constant_char_array_p(__first2, __len))
+               return __equal<false>::equal(__first1, __last1, __first2);

I've done a quick test here, and it works AFAICT.  Below's a standalone and
slightly simplified version of std::equal with that change.  The static asserts
don't fail, and the result at run time looks right too.

At -O0, the (runtime calls to) std::equal stay around, but AFAICT, there's no
code generated for the __builtin_constant_p checks/loop in
__constant_char_array_p.

Disassembling the generated code at -O2 I see that g++ fully optimizes out the
equal1/equal2/equal2 calls to constant "1".  

Interestingly, the trick makes g++ optimize the test _better_ than the same
code _without_ the __builtin_constant_p (i.e, better than current trunk). 
Without the __builtin_char_array_p detour, we end up with calls to memcmp in
the generated code instead of constant folding out everything.  (you'll need to
comment out the static_assert calls too to try that out, of course.)
ISTR having observed something like this this too with the char_traits changes,
though I'm not sure I ever mentioned it.

I've tried to increase the size of the arrays in the test to check whether
that'd fool the optimizers, but it still works.  If you increase the size
enough, you'll hit the -fconstexpr-loop-limit limit, in the loop inside
__builtin_char_array_p, but you'd hit the exact same limit in the
__equal<false>::equal's for loop anyway.  Bumping that limit, the test still
works, though of course compilation takes noticeably longer.

~~~~~~~~~~~~~~~~~~~~~~~~
#include <algorithm>
#include <iterator>
#include <stdio.h>

namespace my_std
{
  /* This is currently in char_traits, could/should probably move
     elsewhere.  */
  template<typename _CharT>
    static _GLIBCXX_ALWAYS_INLINE constexpr bool
    __constant_char_array_p(const _CharT* __a, size_t __n)
    {
      size_t __i = 0;
      while (__builtin_constant_p(__a[__i]) && __i < __n)
        __i++;
      return __i == __n;
    }

  template<bool _BoolType>
    struct __equal
    {
      template<typename _II1, typename _II2>
        static constexpr bool
        equal(_II1 __first1, _II1 __last1, _II2 __first2)
        {
          for (; __first1 != __last1; ++__first1, (void) ++__first2)
            if (!(*__first1 == *__first2))
              return false;
          return true;
        }
    };

  template<>
    struct __equal<true>
    {
      template<typename _Tp>
      static constexpr bool
        equal(const _Tp* __first1, const _Tp* __last1, const _Tp* __first2)
        {
          if (const size_t __len = (__last1 - __first1))
            {
#if 1
              // new bits are here
              if (__constant_char_array_p(__first1, __len)
                  && __constant_char_array_p(__first2, __len))
                return __equal<false>::equal(__first1, __last1, __first2);
#endif

              return !__builtin_memcmp(__first1, __first2, sizeof(_Tp) *
__len);
            }
          return true;
        }
    };

  template<typename _II1, typename _II2>
  constexpr inline bool
  equal(_II1 __first1, _II1 __last1, _II2 __first2)
  {
      typedef typename std::iterator_traits<_II1>::value_type _ValueType1;
      typedef typename std::iterator_traits<_II2>::value_type _ValueType2;
      const bool __simple = ((std::__is_integer<_ValueType1>::__value
                              || std::__is_pointer<_ValueType1>::__value)
                             && std::__is_pointer<_II1>::__value
                             && std::__is_pointer<_II2>::__value
                             && std::__are_same<_ValueType1,
_ValueType2>::__value);

      return my_std::__equal<__simple>::equal(__first1, __last1, __first2);
    }
}

struct S
{
  constexpr bool operator==(const S& rhs)
  {
    return i == rhs.i;
  }

  int i = 0;
};

template<typename T>
constexpr bool equal1()
{
  T s1[400] = {1, 2, 3, 1, 2, 3};
  T s2[400] = {1, 2, 3, 1, 2, 3};

  const size_t count = sizeof (s1) / sizeof (s1[0]);

  return (my_std::equal(s1, s1 + 3, s2 + 3)
          && !my_std::equal(s1, s1 + 1, s2 + 1)
          && my_std::equal(s1, s1 + count, s2));
}

constexpr bool equal2()
{
  int i1 = 0;
  int i2 = 0;

  return (my_std::equal(&i1, &i1 + 1, &i2));
}

constexpr bool equal3()
{
  S s1[400];
  S s2[400];

  const size_t count = sizeof (s1) / sizeof (s1[0]);

  for (size_t i = 0; i < count; i++)
    s1[i].i = s2[i].i = i;

  return (my_std::equal(s1, s1 + count, s2)
          && !my_std::equal(s1, s1 + 1, s2 + 1));
}

// disable if you disable the new bits above too.
#if 1
static_assert (equal1<char>());
static_assert (equal1<int>());
static_assert (equal1<unsigned long long>());
static_assert (equal2());
static_assert (equal3());
#endif

int
main ()
{
#define TEST(EXPR) \
  printf (#EXPR " = %d\n", (int) EXPR)

  TEST (equal1<char>());
  TEST (equal1<int>());
  TEST (equal1<unsigned long long>());
  TEST (equal2());
  TEST (equal3());

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~

Reply via email to