N.B. All libstdc++ patches should be sent to the libstdc++ list (not
just to the names in the MAINTAINERS file).

On Sat, 14 Dec 2024 at 01:01, Jonathan Wakely <jwak...@redhat.com> wrote:
>
> On Fri, 13 Dec 2024 at 17:12, Tamar Christina <tamar.christ...@arm.com> wrote:
> >
> > Hi All,
> >
> > In GCC 12 there was a ~40% regression in the performance of hashmap->find.
> >
> > This regression came about accidentally:
> >
> > Before GCC 12 the find function was small enough that IPA would inline it 
> > even
> > though it wasn't marked inline.  In GCC-12 an optimization was added to 
> > perform
> > a linear search when the entries in the hashmap are small.
> >
> > This increased the size of the function enough that IPA would no longer 
> > inline.
> > Inlining had two benefits:
> >
> > 1.  The return value is a reference. so it has to be returned and 
> > dereferenced
> >     even though the search loop may have already dereference it.
> > 2.  The pattern is a hard pattern to track for branch predictors.  This 
> > causes
> >     a large number of branch misses if the value is immediately checked and
> >     branched on. i.e. if (a != m.end()) which is a common pattern.
> >
> > The patch fixes both these issues by adding the inline keyword to _M_locate
> > to allow the inliner to consider inlining again.
> >
> > This and the other patches have been ran through serveral benchmarks where
> > the size, number of elements searched for and type (reference vs value) etc
> > were tested.
> >
> > The change shows no statistical regression, but an average find improvement 
> > of
> > ~27% and a range between ~10-60% improvements.  A selection of the results:
> >
> > +-----------+--------------------+-------+----------+
> > | Group     | Benchmark          | Size  | % Inline |
> > +-----------+--------------------+-------+----------+
> > | Find      | unord<uint64_t     | 11264 | 53.52%   |
> > | Find      | unord<uint64_t     | 11264 | 47.98%   |
> > | Find Many | unord<uint64_t     | 11    | 47.62%   |
> > | Find Many | unord<std::string  | 11    | 44.94%   |
> > | Find Many | unord<std::string  | 11    | 44.89%   |
> > | Find Many | unord<uint64_t     | 11    | 40.90%   |
> > | Find      | unord<NonSSOString | 11    | 31.80%   |
> > | Find      | unord<NonSSOString | 11    | 31.17%   |
> > | Find Many | unord<uint64_t     | 352   | 30.57%   |
> > | Find      | unord<uint64_t     | 352   | 28.27%   |
> > | Find Many | unord<uint64_t     | 352   | 26.80%   |
> > | Find Many | unord<NonSSOString | 352   | 26.59%   |
> > | Find Many | unord<NonSSOString | 352   | 25.93%   |
> > | Find      | unord<std::string  | 11    | 25.66%   |
> > | Find Many | unord<NonSSOString | 11    | 23.21%   |
> > | Find Many | unord<std::string  | 352   | 23.12%   |
> > | Find Many | unord<NonSSOString | 11    | 22.04%   |
> > | Find Many | unord<uint64_t     | 11264 | 21.33%   |
> > | Find Many | unord<uint64_t     | 11264 | 20.71%   |
> > | Find      | unord<std::string  | 11    | 20.36%   |
> > | Find Many | unord<std::string  | 352   | 19.23%   |
> > | Find      | unord<std::string  | 352   | 18.59%   |
> > | Find      | unord<uint64_t     | 352   | 15.43%   |
> > | Find      | unord<NonSSOString | 352   | 13.79%   |
> > | Find      | unord<std::string  | 11264 | 11.80%   |
> > | Find      | unord<std::string  | 352   | 11.12%   |
> > | Find      | val<uint64_t       | 11    | 9.98%    |
> > | Find      | unord<std::string  | 11264 | 9.97%    |
> > | Find      | unord<NonSSOString | 352   | 9.69%    |
> > | Find      | unord<NonSSOString | 11264 | 9.19%    |
> > +-----------+--------------------+-------+----------+
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master? and possible backports?
>
> Thanks for the thorough benchmarking.
>
> OK for trunk.
>
> A different patch will be needed for the release branches because
> _M_locate only exists on trunk.
>
> >
> > Thanks,
> > Tamar
> >
> > libstdc++-v3/ChangeLog:
> >
> >         * include/bits/hashtable.h: Inline _M_locate.
> >
> > ---
> > diff --git a/libstdc++-v3/include/bits/hashtable.h 
> > b/libstdc++-v3/include/bits/hashtable.h
> > index 
> > b8bd8c2f41816a800bab9b3589fe609b16285ad1..e791e52ec329277474f3218d8a44cd37ded14ac3
> >  100644
> > --- a/libstdc++-v3/include/bits/hashtable.h
> > +++ b/libstdc++-v3/include/bits/hashtable.h
> > @@ -2213,7 +2213,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >            typename _ExtractKey, typename _Equal,
> >            typename _Hash, typename _RangeHash, typename _Unused,
> >            typename _RehashPolicy, typename _Traits>
> > -    auto
> > +    inline auto
> >      _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
> >                _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
> >      _M_locate(const key_type& __k) const
> >
> >
> >
> >
> > --

Reply via email to