On Saturday, 20 January 2024 at 19:45:19 UTC, Daniel Kozak wrote:
On Sat, Jan 20, 2024 at 2:11 PM Renato via Digitalmars-d-learn < digitalmars-d-learn@puremagic.com> wrote:

Wow, fantastic feedback from lots of people, thanks so much! ...

> evilrat:
> There is another important difference, i quickly looked up D
> associative array implementation and the search looks like
> nlog(n) complexity with plain loop iteration of hashes, > whereas
> rust's implementation is using "swisstable" algorithm
> implementation that has packed SIMD optimized lookups, this > is
> likely where the 3x speed difference comes from.

I am not using the default hash implementation in Rust (which is hashbrown as you've found out). That's too slow (because it's ddos secure - still Java's hash also is and it's still faster). I had to replace that with a library called `ahash`.

If you're interested in knowing more about that, please [read my blog post](https://renato.athaydes.com/posts/how-to-write-fast-rust-code) about optimising the Rust code.

So, no, that's not where the difference is coming from... in fact, I am using a faster hashing function in D.

You can [see my custom hashing function
here](
https://github.com/renatoathaydes/prechelt-phone-number-encoding/commit/d1fa1b77ad928f7d536728fba11f2d69b4afe7e3).
This function is not good for the general purpose case, but it's probably as good as it gets for this problem (see my int128 trick in a previous post on this thread to understand why). I did try a few variations of hashing before settling on this one... Rust, if anything, is at a disadvantage here.


And here you are wrong, it is the hashing when the slowdown comes. It is not only about the speed of hashing function. It is about the quality of hashing functiuon for this particular problem and it is about how hashing table (associative array) is implemented.

I've explained why this function is almost a perfect hash function for this problem (there will be almost no collisions except for very large inputs where the shift-left will "overflow", and even then the probability of collisions should be very low). If you're going to claim I'm wrong, you must show exactly where I'm wrong, and preferrably you should provide a faster implementation. I will even accept a theoretical explanation if you can give one. What I will not accept, is for you to call me "wrong" just because you say so. That's childish behaviour and uncalled for on a technical discussion.

Reply via email to