Alan sent me an article by skarupke a while back; skarupke has devoted a lot
of time to optimizing hash maps.
Talk: https://www.youtube.com/watch?v=M2fKMP47slQ
Code: https://github.com/skarupke/flat_hash_map/blob/master/bytell_hash_map.hpp

I'm interested in whether my DB can use this type of intrusive/flat
allocation. It requires that every element remain valid for read-only
access after it has been removed from the tables, until its refcount drops
to zero; that is why I'm using shared_ptr. See "DBTable, DBHost, DBAddr,
DBAddrPort" (apache/trafficserver Pull Request #4110):
<https://github.com/apache/trafficserver/pull/4110>
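Roughly what I mean, as a minimal sketch (Host and hosts are made-up names
here, not the types in the PR): an element erased from the table stays
valid for any reader that still holds a shared_ptr to it.

    // Sketch: an erased element stays readable while a reader still holds
    // a shared_ptr to it; it is freed only when the refcount hits zero.
    #include <cassert>
    #include <memory>
    #include <string>
    #include <unordered_map>

    struct Host {            // hypothetical stand-in for a DB element
      std::string name;
    };

    int main() {
      std::unordered_map<std::string, std::shared_ptr<Host>> hosts;
      hosts["example.com"] = std::make_shared<Host>(Host{"example.com"});

      std::shared_ptr<Host> reader = hosts["example.com"]; // refcount: 2
      hosts.erase("example.com");                          // refcount: 1
      assert(reader->name == "example.com");               // still valid, read-only
      return 0;  // reader goes out of scope, refcount: 0, Host is freed
    }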

On Fri, Aug 24, 2018 at 8:09 AM Alan Carroll
<solidwallofc...@oath.com.invalid> wrote:

> I did some simple testing on IntrusiveHashMap vs. std::unordered_map. The
> first test was insertion and lookup, with 1009 items. The lookup loop was
> run 1000 times, each time looking up each item once, in a different order
> every loop, for roughly 1M lookups. The second case is a double-indexed
> insert and lookup with the same setup, with the allocation of the objects
> not included (as would be the case for the shared session table). The
> std::unordered_maps stored raw pointers to the objects rather than shared
> pointers, to make them a little faster. You can see IntrusiveHashMap is
> somewhat faster on inserts and single lookups, but on the double-index
> lookup IntrusiveHashMap is almost 4x faster than std::unordered_map. This
> would tend to indicate that the double lookup is in fact a performance
> hit. These numbers do take advantage of the more efficient string lookup
> in IntrusiveHashMap, because it can store a std::string as a key but do
> the lookup with a std::string_view, so the caller doesn't have to
> construct a std::string. This would be the normal use case inside ATS.
>
> -O0:
> IHM populate 621540ns or 0ms
> IHM find 210755306ns or 210ms
> UM populate 832934ns or 0ms
> UM find 332646889ns or 332ms
> -O2:
> IHM populate 183626ns or 0ms
> IHM find 69232602ns or 69ms
> UM populate 191563ns or 0ms
> UM find 78888346ns or 78ms
> IHM2 populate 208932ns or 0ms
> IHM2 find 20127315ns or 20ms
> UM2 populate 254858ns or 0ms
> UM2 find 83589823ns or 83ms
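(To make the string_view point above concrete: the table hashes and
compares the caller's std::string_view directly against the stored
std::string keys, so no temporary std::string is built per lookup. A rough
illustration of the same idea, using C++20 transparent hashing on
std::unordered_map rather than the actual IntrusiveHashMap API:)

    // Requires C++20 (heterogeneous lookup for unordered containers).
    #include <cstddef>
    #include <functional>
    #include <string>
    #include <string_view>
    #include <unordered_map>

    // Transparent hash: find() can take a std::string_view directly.
    struct StringHash {
      using is_transparent = void;
      std::size_t operator()(std::string_view sv) const {
        return std::hash<std::string_view>{}(sv);
      }
    };

    int main() {
      std::unordered_map<std::string, int, StringHash, std::equal_to<>> table;
      table.emplace("session-key", 42);

      std::string_view key = "session-key";
      auto it = table.find(key);   // no std::string constructed for the lookup
      return it != table.end() ? 0 : 1;
    }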
>
> On Thu, Aug 23, 2018 at 5:49 PM Bryan Call <bc...@apache.org> wrote:
>
> > Whether inserting items allocates memory is a property of the hash
> > table, not of whether it is intrusive vs. non-intrusive (although
> > intrusive tables are allocation-free on insert by default). Find on a
> > better-designed hash table would be many times faster, so even having
> > to do a double find on delete isn't a performance hit.
> >
>
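On the intrusive vs. non-intrusive point: an intrusive map keeps the link
field inside the element itself, so insert just threads the existing object
onto a bucket chain and never allocates, while std::unordered_map allocates
a node per insert. A bare-bones sketch of that idea (not ATS's actual
IntrusiveHashMap, and with a fixed bucket count for brevity):

    #include <cstddef>
    #include <functional>
    #include <string>
    #include <vector>

    // The element carries its own link pointer, so insert() never allocates;
    // it only threads the element onto the head of a bucket chain.
    struct Session {
      std::string key;
      Session    *hash_link = nullptr;   // intrusive link, lives in the element
    };

    class IntrusiveMap {
      std::vector<Session *> buckets_ = std::vector<Session *>(64, nullptr);

    public:
      void insert(Session *s) {          // no allocation here
        auto &head = buckets_[std::hash<std::string>{}(s->key) % buckets_.size()];
        s->hash_link = head;
        head = s;
      }
      Session *find(const std::string &k) {
        for (Session *s = buckets_[std::hash<std::string>{}(k) % buckets_.size()];
             s != nullptr; s = s->hash_link) {
          if (s->key == k) return s;
        }
        return nullptr;
      }
    };

    int main() {
      Session a{"abc"};   // the element owns its storage; the map only links it
      IntrusiveMap m;
      m.insert(&a);
      return m.find("abc") == &a ? 0 : 1;
    }

The price is that the element has to outlive its membership in the map,
which is exactly where the shared_ptr/refcount question above comes in.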


-- 
Aaron Canary
ATS - Senior Software Engineer
