Thanks Erik. It turns out that for me, Dict is around 20% faster than sorted vector. -- Greg
On Thursday, August 18, 2016 at 10:11:42 AM UTC+10, Erik Schnetter wrote: > For the insertion stage, the Dict is likely ideal. > > For the second stage, a Dict that is based on hashing is a good data > structure. Another data structure worth examining would be a sorted vector > of `(Int64, Float64)` pairs (sorted by the `Int64` keys). Interpolation > search <https://en.wikipedia.org/wiki/Interpolation_search> can then have > a complexity of `O(log log N)`. My personal guess is that a well-optimized > Dict (i.e. with both hash function and hash table size "optimized") will be > faster if we assume a close to random access, as it will have fewer memory > accesses. > > Julia's Dict implementation (see base/dict.jl) has constants that you > could tune (read: play with). There is also a function `Base.rehash!` that > you can call to increase the size of the hash table, which might increase > performance by avoiding hash collisions, if you have sufficient memory. > > -erik > > > On Wed, Aug 17, 2016 at 7:58 PM, 'Greg Plowman' via julia-users < > [email protected] <javascript:>> wrote: > >> I need fast random access to a Dict{Int64,Float64}. >> My application has a first phase in which the Dict is populated, and a >> second phase where it accessed randomly (with no further additions or >> deletions). >> There are about 50,000 to 100,000 entries, with keys in the range 10^9 to >> 10^14. >> >> Firstly is a Dict the most optimal data structure? (presumably Dict is >> faster than SparseVector for random lookup?) >> >> Secondly, is there any "preconditioning" I can do after creating the Dict >> in phase 1 but before phase 2, to optimise the Dict for random access. >> (e.g. sizehint, sorting keys) >> >> I do appreciate access might already be fast and optimal and I should >> just benchmark, but I'm just looking for some theoretical >> insight before benchmarking. >> >> >> > > > -- > Erik Schnetter <[email protected] <javascript:>> > http://www.perimeterinstitute.ca/personal/eschnetter/ >
