https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545
Jonathan Wakely <redi at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Georg Sauthoff from comment #0)
> Created attachment 45259 [details]
> specialize std::find to memchr for character searches in contiguous memory
>
> If std::find() is called with contiguous random access iterators and a
> trivial char-sized value, then calling memchr() is much more efficient than
> calling into the generic __find_if().
>
> The attached patch implements this optimization.
>
> That means it specializes a std::find helper on the iterator category and
> the value and calls __builtin_memchr() if possible.
Why specialize on the iterator category, when the __is_simple boolean already
checks if the iterator is a pointer?
The condition of a trivial byte-sized type seems insufficient, because you could
have:
struct B {
  char c;
  bool operator==(const B& b) const { return true; }
};
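To make the problem concrete, here is a minimal sketch comparing what std::find returns for such a type with what a bytewise memchr search would return. The helper names index_by_equality and index_by_memchr are hypothetical and exist only for this illustration; they are not part of libstdc++ or of the attached patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Trivial, byte-sized type whose operator== ignores the stored byte.
struct B {
  char c;
  bool operator==(const B&) const { return true; }
};
static_assert(sizeof(B) == 1, "B is a single byte");

// Hypothetical helper: index found by std::find, which uses B::operator==.
inline std::ptrdiff_t index_by_equality(const B* first, const B* last,
                                        const B& val) {
  return std::find(first, last, val) - first;
}

// Hypothetical helper: index found by comparing raw bytes with memchr.
inline std::ptrdiff_t index_by_memchr(const B* first, const B* last,
                                      const B& val) {
  const std::size_t n = static_cast<std::size_t>(last - first);
  if (n == 0)
    return 0;
  const void* p = std::memchr(first, static_cast<unsigned char>(val.c), n);
  return p ? static_cast<const char*>(p) - reinterpret_cast<const char*>(first)
           : last - first;
}
```

For the array {'a', 'b', 'c'} and the value B{'c'}, the equality-based search stops at index 0 because every element compares equal, while the bytewise search reports index 2. The two searches disagree, so dispatching to memchr based only on size and triviality would change observable behavior.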
I would prefer to do simply:
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -3846,6 +3846,32 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
       __glibcxx_function_requires(_EqualOpConcept<
                 typename iterator_traits<_InputIterator>::value_type, _Tp>)
       __glibcxx_requires_valid_range(__first, __last);
+
+#if __cpp_if_constexpr
+      using _ValT = typename iterator_traits<_InputIterator>::value_type;
+      if constexpr (is_same_v<_ValT, _Tp>)
+        if constexpr (__is_byte<_ValT>::__value)
+#if __cpp_lib_concepts
+          if constexpr (contiguous_iterator<_InputIterator>)
+            {
+              if (const size_t __n = __last - __first)
+                {
+                  auto __p0 = std::to_address(__first);
+                  if (auto __p1 = __builtin_memchr(__p0, __val, __n))
+                    return __first + (__p1 - __p0);
+                }
+              return __last;
+            }
+#else
+          if constexpr (is_pointer_v<_InputIterator>)
+            {
+              if (const size_t __n = __last - __first)
+                if (auto __p = __builtin_memchr(__first, __val, __n))
+                  return __p;
+              return __last;
+            }
+#endif
+#endif
       return std::__find_if(__first, __last,
                             __gnu_cxx::__ops::__iter_equals_val(__val));
     }
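For readers who want to try the idea outside libstdc++, the pointer-only branch can be sketched as a standalone function. This is an illustration under stated assumptions, not the library code: find_byte is a hypothetical name, std::memchr stands in for __builtin_memchr, and a static_assert on integral one-byte types stands in for the internal __is_byte trait. Because memchr returns a void pointer, the sketch casts the result back to the element pointer type explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical standalone helper: search [first, last) for val using memchr.
// Restricted to one-byte integral types, an assumption standing in for the
// library's internal byte-type check.
template <typename T>
const T* find_byte(const T* first, const T* last, T val) {
  static_assert(std::is_integral<T>::value && sizeof(T) == 1,
                "find_byte requires a one-byte integral type");
  // Skip the call entirely for an empty range.
  if (const std::size_t n = static_cast<std::size_t>(last - first)) {
    // memchr compares each byte against val converted to unsigned char.
    if (const void* p = std::memchr(first, static_cast<unsigned char>(val), n))
      return static_cast<const T*>(p);
  }
  return last;
}
```

Called as find_byte(s, s + 5, 'l') on the buffer "hello", this returns a pointer to the first 'l'; when the value is absent it returns last, matching the std::find convention.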
I think we're going to remove the manual loop unrolling in __find_if for GCC
15, which should allow the compiler to optimize it better, potentially
auto-vectorizing. That might make memchr less advantageous, but I think it's
worth doing anyway.