> Hi, > looking into reason why we still do throw_bad_alloc in clang binary I noticed > that quite few calls come from deque::_M_reallocate_map. This patch adds > unreachable to limit the size of realloc_map. _M_reallocate_map is called > only > if new size is smaller then max_size. map is an array holding pointers to > entries of fixed size. > > Since rellocation is done by doubling the map size, I think the maximal size > of > map allocated is max_size / deque_buf_size rounded up times two. This should > be also safe for overflows since we have extra bit. > > map size is always at least 8. Theoretically this computation may be wrong for > very large T, but in that case callers should never reallocate. > > On the testcase I get: > jh@shroud:~> ~/trunk-install-new4/bin/g++ -O2 dq.C -c ; size -A dq.o | grep > text > .text 284 0 > .text._ZNSt5dequeIiSaIiEE17_M_reallocate_mapEmb 485 0 > .text.unlikely 10 0 > jh@shroud:~> ~/trunk-install-new5/bin/g++ -O2 dq.C -c ; size -A dq.o | grep > text > .text 284 0 > .text._ZNSt5dequeIiSaIiEE17_M_reallocate_mapEmb 465 0 > .text.unlikely 10 0 > > so this saves about 20 bytes of rellocate_map, which I think is worthwhile. > Curiously enough gcc14 does: > > jh@shroud:~> g++ -O2 dq.C -c ; size -A dq.o | grep text > .text 604 0 > .text.unlikely 10 0 > > which is 145 bytes smaller. Obvoius difference is that _M_reallocate_map gets > inlined. > Compiling gcc14 preprocessed file with trunk gives: > > jh@shroud:~> g++ -O2 dq.C -S ; size -A dq.o | grep text > .text 762 0 > .text.unlikely 10 0 > > So inlining is due to changes at libstdc++ side, but code size growth is due > to > something else. > > For clang this reduced number of thris_bad_new_array_length from 121 to 61. > > Regtested x86_64-linux, OK? > > libstdc++-v3/ChangeLog: > > * include/bits/deque.tcc: Compute maximal size of alloc_map. > > gcc/testsuite/ChangeLog: > > * g++.dg/tree-ssa/deque.C: New test. > > > diff --git a/gcc/testsuite/g++.dg/tree-ssa/deque.C > b/gcc/testsuite/g++.dg/tree-ssa/deque.C > new file mode 100644 > index 00000000000..c79de9b2161 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/tree-ssa/deque.C > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-optimized" } */ > +#include <deque> > +void > +test(std::deque<int> &q, int v) > +{ > + q.push_back (v); > +} > +// { dg-final { scan-tree-dump-not "throw_bad_alloc" "optimized" } } > diff --git a/libstdc++-v3/include/bits/deque.tcc > b/libstdc++-v3/include/bits/deque.tcc > index deb010a0ebb..653354f90a7 100644 > --- a/libstdc++-v3/include/bits/deque.tcc > +++ b/libstdc++-v3/include/bits/deque.tcc > @@ -955,6 +955,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER > size_type __new_map_size = this->_M_impl._M_map_size > + std::max(this->_M_impl._M_map_size, > __nodes_to_add) + 2; > + size_type __max = __deque_buf_size(sizeof(_Tp)); > + if (__new_map_size > ((max_size() + __deque_buf_size(sizeof(_Tp)) - 1) > + / __deque_buf_size(sizeof(_Tp))) * 2) > + __builtin_unreachable ();
I forgot dead variable here which yields to -Wall warning. Also I noticed that deque copy operation may be optimized if we know that size <= max_size. With this change we can now fully optimize away unused deque copies. I am not sure how common it is in practice, but I think all the containers should be optimizable this way. Next interesting challenge (seen at llvm binary) is std::hash. Here is updated patch which got regtested&bootstrapped on x86_64-linux libstdc++-v3/ChangeLog: * include/bits/deque.tcc (std::deque::_M_reallocate_map): Add __builtin_unreachable check to declare that maps are not very large. * include/bits/stl_deque.h (std::deque::size): Add __builtin_unreachable to check for maximal size of map. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/deque-1.C: New test. * g++.dg/tree-ssa/deque-2.C: New test. diff --git a/gcc/testsuite/g++.dg/tree-ssa/deque-1.C b/gcc/testsuite/g++.dg/tree-ssa/deque-1.C new file mode 100644 index 00000000000..c639ebb1a5f --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/deque-1.C @@ -0,0 +1,9 @@ +// { dg-do compile } +// { dg-options "-O1 -fdump-tree-optimized" } +#include <deque> +void +test(std::deque<int> &q, int v) +{ + q.push_back (v); +} +// { dg-final { scan-tree-dump-not "throw_bad_alloc" "optimized" } } diff --git a/gcc/testsuite/g++.dg/tree-ssa/deque-2.C b/gcc/testsuite/g++.dg/tree-ssa/deque-2.C new file mode 100644 index 00000000000..7e268b3f018 --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/deque-2.C @@ -0,0 +1,10 @@ +// { dg-do compile } +// { dg-options "-O3 -fdump-tree-optimized" } +#include <deque> +std::deque<int *> +test2(std::deque<int *> &q) +{ + return q; +} +// rethrow is OK, but throw is not. +// { dg-final { scan-tree-dump-not {[^e]throw} "optimized" } } diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc index deb010a0ebb..e56cd0b9319 100644 --- a/libstdc++-v3/include/bits/deque.tcc +++ b/libstdc++-v3/include/bits/deque.tcc @@ -955,6 +955,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER size_type __new_map_size = this->_M_impl._M_map_size + std::max(this->_M_impl._M_map_size, __nodes_to_add) + 2; + if (__new_map_size > ((max_size() + __deque_buf_size(sizeof(_Tp)) - 1) + / __deque_buf_size(sizeof(_Tp))) * 2) + __builtin_unreachable (); _Map_pointer __new_map = this->_M_allocate_map(__new_map_size); __new_nstart = __new_map + (__new_map_size - __new_num_nodes) / 2 diff --git a/libstdc++-v3/include/bits/stl_deque.h b/libstdc++-v3/include/bits/stl_deque.h index c617933bd81..e47e7059330 100644 --- a/libstdc++-v3/include/bits/stl_deque.h +++ b/libstdc++-v3/include/bits/stl_deque.h @@ -1266,7 +1266,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _GLIBCXX_NODISCARD size_type size() const _GLIBCXX_NOEXCEPT - { return this->_M_impl._M_finish - this->_M_impl._M_start; } + { + size_type __siz = this->_M_impl._M_finish - this->_M_impl._M_start; + if (__siz > max_size ()) + __builtin_unreachable (); + return __siz; + } /** Returns the size() of the largest possible %deque. */ _GLIBCXX_NODISCARD