Am 29.10.2013 10:09, schrieb Karsten Blees:
> On Mon, Oct 28, 2013 at 10:04 PM, Vicent Martí <tan...@gmail.com> wrote:
>>
>> On Mon, Oct 28, 2013 at 8:45 PM, Karsten Blees <karsten.bl...@gmail.com> 
>> wrote:
>>
>>> Regarding performance, khash uses open addressing, which requires more key 
>>> comparisons (O(1/(1-load_factor))) than chaining (O(1+load_factor)). 
>>> However, any measurable differences will most likely be dwarfed by IO costs 
>>> in this particular use case.
>>
>> I don't think this is true. If you actually run a couple insertion and
>> lookup benchmarks comparing the two implementations, you'll find khash
>> to be around ~30% faster for most workloads (venturing a guess from
>> past experience). I am obviously not the author of khash, but I've
...

Just out of curiosity, I added performance test code for khash to the test in 
my current hashmap patch series [1]. It turns out that khash is by far the 
slowest of the bunch, especially with many collisions.

Again, I don't think that performance matters all that much (or in other words: 
_any_ hash table implementation will probably be fast enough compared to the 
rest that's going on). Its more a question of whether we really need two 
different hash table implementations (and a queasy feeling about the macro 
kludge in khash.h...).


Khash doesn't store the hash codes along with the entries (as both hash.[ch] 
and hashmap.[ch] do), so it needs to re-calculate hash codes on every resize. 
For a fair comparison, the "khash" test uses keys with pre-calculated hash 
codes in the key structure. This should be similar to a hash function that just 
copies 4 bytes from a sha1 key. Khash maps with more complex hash functions 
will be slower (see khstr).

The "khstr" test uses khash's predefined string map and khash's X31 hash 
function for strings (therefore no separate values for different hash functions 
here).

The table is similar to what I posted for hashmap-v2 [2] (i.e. real time in 
seconds for 1,000 rounds á 100,000 entries). I just turned it around a bit to 
make room for khash columns.

test | hash_fn | hashmap |  hash   |  khash  | khstr  |
-----+---------+---------+---------+---------+--------+
     | FNV     |   2.429 |  14.366 |  11.780 | 18.677 |
     | FNV  x2 |   2.946 |  14.558 |  10.922 |        |
 add | i       |   1.708 |   7.419 |   4.132 |        |
     | i    x2 |   1.791 |   8.565 |   4.502 |        |
     | i/10    |   1.555 |   1.805 | 344.691 |        |
     | i/10 x2 |   1.543 |   1.808 | 319.559 |        |
-----+---------+---------+---------+---------+--------+
     | FNV     |   1.822 |   3.452 |   4.922 |  8.309 |
get  | FNV  x2 |   2.298 |   3.194 |   4.473 |        |
100% | i       |   1.252 |   1.344 |   0.944 |        |
hits | i    x2 |   1.286 |   1.434 |   1.220 |        |
     | i/10    |   6.720 |   5.138 | 281.815 |        |
     | i/10 x2 |   6.297 |   5.188 | 257.021 |        |
-----+---------+---------+---------+---------+--------+
     | FNV     |   1.023 |   3.949 |   4.115 |  4.878 |
get  | FNV  x2 |   1.538 |   3.915 |   4.571 |        |
10%  | i       |   0.654 | 397.457 |  38.125 |        |
hits | i    x2 |   0.718 |   0.722 |   9.111 |        |
     | i/10    |   1.128 |  30.235 |  60.376 |        |
     | i/10 x2 |   1.260 |   1.082 |  43.354 |        |
-----+---------+---------+---------+---------+--------+

[1] https://github.com/kblees/git/commits/kb/hashmap-v5-khash
[2] http://article.gmane.org/gmane.comp.version-control.git/235290


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to