On Thu, 23 Nov 2023 at 15:44, Jan Hubicka <hubi...@ucw.cz> wrote: > > Hi, > so if I understand it right, it should be safe to simply replace memmove > by memcpy. I wonder if we can get rid of the count != 0 check at least > for glibc systems.
I don't think we can do that. It's still undefined with glibc, and glibc marks it with __attribute__((nonnull)), and ubsan will diagnose it. > In general push_back now need inline-insns-auto to > be 33 to be inlined at -O2 > > > jh@ryzen4:/tmp> cat ~/tt.C > #include <vector> > typedef unsigned int uint32_t; > struct pair_t {uint32_t first, second;}; > struct pair_t pair; > void > test() > { > std::vector<pair_t> stack; > stack.push_back (pair); > while (!stack.empty()) { > pair_t cur = stack.back(); > stack.pop_back(); > if (!cur.first) > { > cur.second++; > stack.push_back (cur); > } > if (cur.second > 10000) > break; > } > } > int > main() > { > for (int i = 0; i < 10000; i++) > test(); > } > > jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param > max-inline-insns-auto=32 ; time ./a.out > > real 0m0.399s > user 0m0.399s > sys 0m0.000s > jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param > max-inline-insns-auto=33 ; time ./a.out > > real 0m0.039s > user 0m0.039s > sys 0m0.000s > > Current inline limit is 15. We can save > - 2 insns if inliner knows that conditional guarding > builtin_unreachable will die (I have patch for this) > - 4 isnsn if we work out that on 64bit hosts allocating vector with > 2^63 elements is impossible > - 2 insns if we allow NULL parameter on memcpy I don't think we can do that. > - 2 insns if we allos NULL parameter on delete That's allowed, I think we just check first to avoid making a function call if it's null, because we know operator delete will do nothing. But if it's hurting inlining, maybe that's the wrong choice. > So thi is 23 instructions. Inliner has hinting which could make > push_back reasonable candidate for -O2 inlining and then we could be > able to propagate interesitng stuff across repeated calls to push_back. > > libstdc++-v3/ChangeLog: > > * include/bits/stl_uninitialized.h (relocate_a_1): Use memcpy instead > of memmove. This patch is OK for trunk. > > diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h > b/libstdc++-v3/include/bits/stl_uninitialized.h > index 1282af3bc43..a9b802774c6 100644 > --- a/libstdc++-v3/include/bits/stl_uninitialized.h > +++ b/libstdc++-v3/include/bits/stl_uninitialized.h > @@ -1119,14 +1119,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > #ifdef __cpp_lib_is_constant_evaluated > if (std::is_constant_evaluated()) > { > - // Can't use memmove. Wrap the pointer so that __relocate_a_1 > + // Can't use memcpy. Wrap the pointer so that __relocate_a_1 > // resolves to the non-trivial overload above. > __gnu_cxx::__normal_iterator<_Tp*, void> __out(__result); > __out = std::__relocate_a_1(__first, __last, __out, __alloc); > return __out.base(); > } > #endif > - __builtin_memmove(__result, __first, __count * sizeof(_Tp)); > + __builtin_memcpy(__result, __first, __count * sizeof(_Tp)); > } > return __result + __count; > } >