Is the superiority of open addressing independent of usage patterns? Does it have a similar ratio of used memory to allocated-but-unused memory? And is the assumption that the first lookup in the chained-bucket implementation is just reading the head pointer of the chain?
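For context on that last question: in a typical chained table each slot of the bucket array holds the head pointer of that bucket's chain, so the first probe is just loading the head pointer and each further probe chases a next pointer. A minimal sketch only, not ATS's IntrusiveHashMap; the Node and ChainedTable names here are hypothetical:

#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical chained hash table: buckets[i] is the head pointer of the
// singly linked chain of nodes whose keys hash to bucket i.
struct Node {
  std::string key;
  int value = 0;
  Node *next = nullptr;
};

struct ChainedTable {
  std::vector<Node *> buckets;

  explicit ChainedTable(std::size_t nbuckets) : buckets(nbuckets, nullptr) {}

  Node *find(std::string_view key) const {
    std::size_t idx = std::hash<std::string_view>{}(key) % buckets.size();
    // First "lookup": load the head pointer of the chain. Every further
    // probe follows a next pointer, which is the cache-unfriendly part
    // of chaining.
    for (Node *n = buckets[idx]; n != nullptr; n = n->next) {
      if (std::string_view(n->key) == key) {
        return n;
      }
    }
    return nullptr;
  }
};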
On Fri, Aug 24, 2018 at 12:46 PM, Bryan Call <bc...@apache.org> wrote:

> I stated that a better designed hash table (not a chained hash table)
> would have much better performance and that a double find would not be
> a performance hit. The benchmark was done with two chained hash tables,
> which are known not to have great performance.
>
> Can you post the source code of the benchmark?
>
> -Bryan
>
>
>> On Aug 24, 2018, at 6:09 AM, Alan Carroll
>> <solidwallofc...@oath.com.INVALID> wrote:
>>
>> I did some simple testing on IntrusiveHashMap vs. std::unordered_map.
>> The first test was insertion and lookup, with 1009 items. The lookup
>> loop was run 1000 times, each time looking up each item once, in a
>> different order every loop, for roughly 1M lookups. The second case is
>> a double-indexed insert and lookup with the same setup, with the
>> allocation of the objects not included (as would be the case for the
>> shared session table). The std::unordered_maps stored raw pointers to
>> the objects, not shared pointers, to make them a little faster. You
>> can see IntrusiveHashMap is somewhat faster on inserts and single
>> lookups, but on the double-index lookup IntrusiveHashMap is almost 4x
>> faster than std::unordered_map. This would tend to indicate that the
>> double lookup is in fact a performance hit. These numbers do take
>> advantage of the more efficient string lookup in IntrusiveHashMap,
>> because it can store a std::string as a key but do the lookup on a
>> std::string_view, so the caller doesn't have to construct a
>> std::string. This would be the normal use case inside ATS.
>>
>> -O0:
>> IHM populate 621540ns or 0ms
>> IHM find 210755306ns or 210ms
>> UM populate 832934ns or 0ms
>> UM find 332646889ns or 332ms
>>
>> -O2:
>> IHM populate 183626ns or 0ms
>> IHM find 69232602ns or 69ms
>> UM populate 191563ns or 0ms
>> UM find 78888346ns or 78ms
>> IHM2 populate 208932ns or 0ms
>> IHM2 find 20127315ns or 20ms
>> UM2 populate 254858ns or 0ms
>> UM2 find 83589823ns or 83ms
>>
>> On Thu, Aug 23, 2018 at 5:49 PM Bryan Call <bc...@apache.org> wrote:
>>
>>> Whether memory is allocated on insert is a property of the hash
>>> table, not of whether it is intrusive or non-intrusive (though
>>> intrusive tables are this way by default). Performance on a better
>>> designed hash table would be many times faster on find, so even
>>> having to do a double find on delete isn't a performance hit.
>>>
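The benchmark source was not posted in this thread. Purely as a sketch, the std::unordered_map half of such a measurement could look like the following, under the setup described above (1009 items, raw-pointer values, 1000 shuffled lookup passes). The Item type, key format, and timing layout are hypothetical:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <string>
#include <unordered_map>
#include <vector>

struct Item {
  std::string key;
  int payload = 0;
};

int main() {
  constexpr int N = 1009;     // item count from the thread
  constexpr int LOOPS = 1000; // lookup passes, ~1M lookups total

  // Objects are allocated up front; allocation is not part of the timed
  // region, matching the description of the test.
  std::vector<Item> items(N);
  for (int i = 0; i < N; ++i) {
    items[i].key = "key-" + std::to_string(i);
    items[i].payload = i;
  }

  // Raw pointers as values, as in the posted results (no shared_ptr).
  std::unordered_map<std::string, Item *> um;

  auto t0 = std::chrono::steady_clock::now();
  for (auto &item : items) {
    um.emplace(item.key, &item);
  }
  auto t1 = std::chrono::steady_clock::now();
  long long populate_ns =
      std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();

  // Each pass looks up every key once, in a freshly shuffled order; the
  // shuffle itself is kept outside the timed region.
  std::vector<int> order(N);
  for (int i = 0; i < N; ++i) order[i] = i;
  std::mt19937 rng(12345);

  long long find_ns = 0;
  long long found = 0;
  for (int loop = 0; loop < LOOPS; ++loop) {
    std::shuffle(order.begin(), order.end(), rng);
    auto s = std::chrono::steady_clock::now();
    for (int i : order) {
      // The key here is already a std::string; in ATS the lookup key is
      // typically a std::string_view, and without C++20 heterogeneous
      // lookup std::unordered_map would force constructing a temporary
      // std::string for each find.
      auto it = um.find(items[i].key);
      if (it != um.end()) {
        ++found;
      }
    }
    auto e = std::chrono::steady_clock::now();
    find_ns +=
        std::chrono::duration_cast<std::chrono::nanoseconds>(e - s).count();
  }

  std::printf("UM populate %lldns or %lldms\n", populate_ns,
              populate_ns / 1000000);
  std::printf("UM find %lldns or %lldms (found %lld)\n", find_ns,
              find_ns / 1000000, found);
  return 0;
}

The IntrusiveHashMap side would presumably follow the same shape, with the link fields living in Item itself and lookups done directly on std::string_view, which is where the temporary std::string construction is avoided.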