On Saturday, 20 January 2024 at 19:45:19 UTC, Daniel Kozak wrote:
On Sat, Jan 20, 2024 at 2:11 PM Renato via Digitalmars-d-learn
< digitalmars-d-learn@puremagic.com> wrote:
Wow, fantastic feedback from lots of people, thanks so much!
...
> evilrat:
> There is another important difference, i quickly looked up D
> associative array implementation and the search looks like
> nlog(n) complexity with plain loop iteration of hashes,
> whereas
> rust's implementation is using "swisstable" algorithm
> implementation that has packed SIMD optimized lookups, this
> is
> likely where the 3x speed difference comes from.
I am not using the default hash implementation in Rust (which
is hashbrown as you've found out). That's too slow (because
it's ddos secure - still Java's hash also is and it's still
faster). I had to replace that with a library called `ahash`.
If you're interested in knowing more about that, please [read
my blog
post](https://renato.athaydes.com/posts/how-to-write-fast-rust-code) about optimising the Rust code.
So, no, that's not where the difference is coming from... in
fact, I am using a faster hashing function in D.
You can [see my custom hashing function
here](
https://github.com/renatoathaydes/prechelt-phone-number-encoding/commit/d1fa1b77ad928f7d536728fba11f2d69b4afe7e3).
This function is not good for the general purpose case, but
it's probably
as good as it gets for this problem (see my int128 trick in a
previous post
on this thread to understand why). I did try a few variations
of hashing
before settling on this one... Rust, if anything, is at a
disadvantage here.
And here you are wrong, it is the hashing when the slowdown
comes. It is not only about the speed of hashing function. It
is about the quality of hashing functiuon for this particular
problem and it is about how hashing table (associative array)
is implemented.
I've explained why this function is almost a perfect hash
function for this problem (there will be almost no collisions
except for very large inputs where the shift-left will
"overflow", and even then the probability of collisions should be
very low). If you're going to claim I'm wrong, you must show
exactly where I'm wrong, and preferrably you should provide a
faster implementation. I will even accept a theoretical
explanation if you can give one. What I will not accept, is for
you to call me "wrong" just because you say so. That's childish
behaviour and uncalled for on a technical discussion.