Thanks Erik.
It turns out that for me, Dict is around 20% faster than sorted vector.
-- Greg
 

On Thursday, August 18, 2016 at 10:11:42 AM UTC+10, Erik Schnetter wrote:

> For the insertion stage, the Dict is likely ideal.
>
> For the second stage, a Dict that is based on hashing is a good data 
> structure. Another data structure worth examining would be a sorted vector 
> of `(Int64, Float64)` pairs (sorted by the `Int64` keys). Interpolation 
> search <https://en.wikipedia.org/wiki/Interpolation_search> can then have 
> a complexity of `O(log log N)`. My personal guess is that a well-optimized 
> Dict (i.e. with both hash function and hash table size "optimized") will be 
> faster if we assume a close to random access, as it will have fewer memory 
> accesses.
>
> Julia's Dict implementation (see base/dict.jl) has constants that you 
> could tune (read: play with). There is also a function `Base.rehash!` that 
> you can call to increase the size of the hash table, which might increase 
> performance by avoiding hash collisions, if you have sufficient memory.
>
> -erik
>
>
> On Wed, Aug 17, 2016 at 7:58 PM, 'Greg Plowman' via julia-users <
> [email protected] <javascript:>> wrote:
>
>> I need fast random access to a Dict{Int64,Float64}.
>> My application has a first phase in which the Dict is populated, and a 
>> second phase where it accessed randomly (with no further additions or 
>> deletions).
>> There are about 50,000 to 100,000 entries, with keys in the range 10^9 to 
>> 10^14.
>>
>> Firstly is a Dict the most optimal data structure? (presumably Dict is 
>> faster than SparseVector for random lookup?)
>>
>> Secondly, is there any "preconditioning" I can do after creating the Dict 
>> in phase 1 but before phase 2, to optimise the Dict for random access. 
>> (e.g. sizehint, sorting keys)
>>
>> I do appreciate access might already be fast and optimal and I should 
>> just benchmark, but I'm just looking for some theoretical 
>> insight before benchmarking.
>>
>>
>>
>
>
> -- 
> Erik Schnetter <[email protected] <javascript:>> 
> http://www.perimeterinstitute.ca/personal/eschnetter/
>

Reply via email to