Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare

Jonathan Wakely via Gcc-patches Tue, 09 Jun 2020 03:25:09 -0700

On 08/06/20 19:20 +0100, Jonathan Wakely wrote:

On 05/06/20 22:24 +0200, FranÃ§ois Dumont via Libstdc++ wrote:

Hi

Â Â Â Here is the last of my algo patches this time to extend thememcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.

Â Â Â To do so I had to return int in implementation details oflexicographical_compare to make sure we can treat a deque iteratorrange by chunk in a performant way.


Tested under Linux x86_64 normal and debug modes.

Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_algobase.h
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_impl): Return int.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare::__lc): Likewise.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1(_II1, _II1, _II2, 
_II2)): New.
(__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
Â Â Â Â Â Â Â Â Â Â Â  _II2, _II2)): Declare.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1(_II1, _II1,
Â Â Â Â Â Â Â Â Â Â Â  _Deque_iterator<>, _Deque_iterator<>)): Declare.
(__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
Â Â Â Â Â Â Â Â Â Â Â  _Deque_iterator<>, _Deque_iterator<>)): Declare.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux): Adapt, call later.


Is this meant to say "latter"? That's still not correct grammar
though. I think it would be better to name the function it calls
explicitly: "Call __lexicographical_compare_aux1."

Â Â Â Â Â Â Â Â Â Â Â (__lexicographical_compare_aux(_Safe_iterator<>,_Safe_iterator<>,
Â Â Â Â Â Â Â Â Â Â Â  _II2, _II2)): Declare.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(_II1, _II1,
Â Â Â Â Â Â Â Â Â Â Â  _Safe_iterator<>, _Safe_iterator<>)): Declare.
Â Â Â Â Â Â Â Â Â Â Â (__lexicographical_compare_aux(_Safe_iterator<>,_Safe_iterator<>,
Â Â Â Â Â Â Â Â Â Â Â  _Safe_iterator<>, _Safe_iterator<>)): Declare.
Â Â Â Â Â Â Â Â Â Â Â (std::lexicographical_compare): Adapt, call laterwithout __niter_base
Â Â Â Â Â Â Â Â Â Â Â  usage.
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/deque.tcc (__lex_cmp_dit): New.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1): New.
Â Â Â Â Â Â Â Â Â Â Â  * include/debug/safe_iterator.tcc
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(const _Safe_iterator<>&,
Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, _II2, _II2)): New.
Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(
Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, const_Safe_iterator<>&,
Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, const _Safe_iterator<>&)): New.
Â Â Â Â Â Â Â Â Â Â Â * testsuite/25_algorithms/lexicographical_compare/1.cc(test6, test7):
Â Â Â Â Â Â Â Â Â Â Â  New.
Â Â Â Â Â Â Â Â Â Â Â *testsuite/25_algorithms/lexicographical_compare/deque_iterators/1.cc:
Â Â Â Â Â Â Â Â Â Â Â  New test.

Ok to commit ?

FranÃ§ois

diff --git a/libstdc++-v3/include/bits/deque.tcc 
b/libstdc++-v3/include/bits/deque.tcc
index 1d32a1691c7..d7dbe64f3e1 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -1261,6 +1261,98 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
     return true;
   }

+  template<typename _Tp, typename _Ref, typename _Ptr, typename _II>


_II is the wrong name here, it mean InputIterator. All callers of this
function are constrained for random access iterators. This cannot be
called with an input iterator. Please use _RAIter.

+    int
+    __lex_cmp_dit(
+       const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __first1,
+       const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __last1,
+       _II __first2, _II __last2)
+    {
+      typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> _Iter;
+      typedef typename _Iter::difference_type difference_type;
+
+      if (__first1._M_node != __last1._M_node)
+       {
+         difference_type __len = __last2 - __first2;
+         difference_type __flen


What does "flen" mean? Why not use len2 for last2 - first2, as we do
elsewhere? And then use len for min(len1, len2)?

+           = std::min(__len, __first1._M_last - __first1._M_cur);
+         if (int __ret = std::__lexicographical_compare_aux1(


This call (and the three later in this function) will do overload
resolution again on the full set of __lexicographical_compare_aux1
overloads, which includes the ones for deque iterators. But we know
that __first1._M_cur and __first1._M_last are not deque iterators.

__first2 could be a deque iterator, but I'm not sure if we really want
one function that has to handle that case anyway.

+             __first1._M_cur, __first1._M_last, __first2, __first2 + __flen))
+           return __ret;
+
+         __first2 += __flen;
+         __len -= __flen;
+         __flen = std::min<size_t>(__len, _Iter::_S_buffer_size());
+         for (typename _Iter::_Map_pointer __node = __first1._M_node + 1;
+              __node != __last1._M_node;
+              __first2 += __flen, __len -= __flen,
+              __flen = std::min<size_t>(__len, _Iter::_S_buffer_size()),
+              ++__node)
+           if (int __ret = std::__lexicographical_compare_aux1(
+                 *__node, *__node + _Iter::_S_buffer_size(),
+                 __first2, __first2 + __flen))
+             return __ret;
+
+         return std::__lexicographical_compare_aux1(
+           __last1._M_first, __last1._M_cur, __first2, __last2);
+       }
+
+      return std::__lexicographical_compare_aux1(
+             __first1._M_cur, __last1._M_cur, __first2, __last2);
+    }
+
+  template<typename _Tp1, typename _Ref1, typename _Ptr1,
+          typename _II2>


This is not an input iterator either.

+    typename __gnu_cxx::__enable_if<
+      __is_random_access_iter<_II2>::__value, int>::__type


This should be inline.

Also, I'm curious why these overloads use __enable_if rather than the
conventional approach of tag dispatching using iterator_category.
(The same question applies to th __equal_aux1 and __copy_move_a1
overloads for deque iterators as well). It doesn't really matter
though.

+    __lexicographical_compare_aux1(
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __first1,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __last1,
+               _II2 __first2, _II2 __last2)
+    { return std::__lex_cmp_dit(__first1, __last1, __first2, __last2); }
+
+  template<typename _Tp1, typename _Ref1, typename _Ptr1,
+          typename _Tp2, typename _Ref2, typename _Ptr2>
+    int


This should be inline.

+    __lexicographical_compare_aux1(
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __first1,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __last1,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __first2,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __last2)
+    { return std::__lex_cmp_dit(__first1, __last1, __first2, __last2); }


I'm not convinced it's useful for this function and the one above it
to share the same logic.

+  template<typename _II1,
+          typename _Tp2, typename _Ref2, typename _Ptr2>
+    typename __gnu_cxx::__enable_if<
+      __is_random_access_iter<_II1>::__value, int>::__type
+    __lexicographical_compare_aux1(
+               _II1 __first1, _II1 __last1,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __first2,
+               _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __last2)
+    {
+      typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> _Iter;
+      typedef typename _Iter::difference_type difference_type;
+
+      difference_type __len = __last1 - __first1;
+      while (__len > 0)
+       {
+         const difference_type __flen = __first2._M_node == __last2._M_node
+           ? __last2._M_cur - __first2._M_cur
+           : __first2._M_last - __first2._M_cur;
+         const difference_type __clen = std::min(__len, __flen);


"flen"? "clen"?

+         if (int __ret = std::__lexicographical_compare_aux1(
+                               __first1, __first1 + __clen,
+                               __first2._M_cur, __first2._M_cur + __flen))
+           return __ret;
+
+         __first1 += __clen;
+         __len -= __clen;
+         __first2 += __clen;
+       }
+
+      return __first1 == __last1 ? (__first2 != __last2 ? -1 : 0) : 1;


Isn't this function doing basically the same as __lex_cmp_dit? Why
does it not use the same logic?

Can this whole function just negate the result of __lex_cmp_dit called
with the arguments switched?

 return -std::__lex_cmp_dit(__first2, __last2, __first1, __last1);

That would also fix the bug that makes the following loop forever:

 std::deque<int> d;
 int i = 0;
 std::lexicographical_compare(&i, &i + 1, d.begin(), d.end());


Same question for __equal_aux1, why is it implemented twice? Why
doesn't __equal_aux1(_II f1, _II l1, _Deque_iterator<> f2) just call
__equal_aux1(f2, f2 + (l1 - f1), f1) ?

Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare

Reply via email to