-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I am the author of https://github.com/whilo/hasch

Would calling hasch.core/edn-hash satisfy your performance
requirements? I tried hard to make the recursion of the protocol
performant, but for big collections hashing a value still takes longer
than writing the data to disk. You should pick a faster message
digest, as you suggested, e.g. MD5:

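(import 'java.security.MessageDigest)
(require '[hasch.core :refer [edn-hash]])

;; Returns a fresh MD5 digest on each call.
(defn ^MessageDigest md5-message-digest []
  (MessageDigest/getInstance "MD5"))

(edn-hash {:foo "Bar" :baz 5} md5-message-digest)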
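
If you only need about 7 hex digits, you can truncate whatever digest
you end up with. A minimal sketch, assuming plain JDK MD5 over a string
rather than hasch itself; digest-bytes and short-hash are names made up
for this example:

```clojure
(import 'java.security.MessageDigest)

;; Assumption: plain JDK MD5 over a UTF-8 string, not hasch's edn-hash.
(defn digest-bytes
  "MD5 digest of a UTF-8 string, as a byte array."
  [^String s]
  (.digest (MessageDigest/getInstance "MD5") (.getBytes s "UTF-8")))

(defn short-hash
  "Keep the first 28 bits (7 hex digits) of a digest as a long."
  [^bytes digest]
  (let [b (fn [i] (bit-and (long (aget digest i)) 0xff))]
    (bit-shift-right
     (bit-or (bit-shift-left (b 0) 24)
             (bit-shift-left (b 1) 16)
             (bit-shift-left (b 2) 8)
             (b 3))
     4)))

;; MD5 of "abc" is 900150983cd24fb0d6963f7d28e17f72 (RFC 1321 test suite).
(format "%07x" (short-hash (digest-bytes "abc"))) ; => "9001509"
```

Truncating like this leaves 2^28 possible values, so collisions become
likely once you have more than a few tens of thousands of records.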

You can use the criterium benchmarking snippets in platform.clj to run
benchmarks. Object.hashCode() is still a lot faster and caches the
result; I am not sure how much overhead the protocol dispatch causes.

Note that if some collisions are OK for you, you might find a better
tradeoff, since at the moment unordered collections like maps and sets
are hashed entry by entry and the results are then XOR'd for safety. I
am interested in your findings and decision, especially if you pick
something else.
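
To illustrate the XOR idea, here is a rough sketch. It is not the code
hasch uses; md5-bytes, xor-bytes and unordered-hash are names made up
for this example, and hashing the pr-str of each entry is an assumption
chosen only to keep it short:

```clojure
(import 'java.security.MessageDigest)

(defn md5-bytes
  "MD5 digest (16 bytes) of a UTF-8 string."
  [^String s]
  (.digest (MessageDigest/getInstance "MD5") (.getBytes s "UTF-8")))

(defn xor-bytes
  "Byte-wise XOR of two equal-length byte arrays."
  [a b]
  (byte-array (map (fn [x y] (unchecked-byte (bit-xor x y))) a b)))

(defn unordered-hash
  "Hash each element on its own, then XOR the digests together.
  XOR is commutative, so the element order does not matter."
  [coll]
  (reduce xor-bytes (byte-array 16) (map (comp md5-bytes pr-str) coll)))

;; Same entries, different insertion order, same combined hash.
(= (seq (unordered-hash {:foo "Bar" :baz 5}))
   (seq (unordered-hash {:baz 5 :foo "Bar"}))) ; => true
```

Every element still gets a full digest here, which is where the time
goes for big collections.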

Christian

On 10.08.2015 09:00, Atamert Ölçgen wrote:
> Hi,
> 
> I need a way to reduce a compound value, say {:foo "bar"}, into a
> number (like 693d9a0698aff95c in hex). I don't necessarily need a
> very large hash space, 7 hex digits is good enough for my purposes.
> But I need this hash to be consistent between runs and JVM versions
> etc. So I guess that rules out standard object hashes.
> 
> I would like to find a sufficiently fast way to do this. I can live
> with MD5, but are there faster alternatives (but produce smaller
> hashes)? (clj-digest <https://github.com/tebeka/clj-digest>
> provides a nice interface to what Java provides but there are only
> usual suspects AFAICS
> <http://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#MessageDigest>)
> 
> I will be dealing with unordered collections, but it seems hashing
> is consistent when the input order is changed:
> 
> user=> (.hashCode {:foo "Bar" :baz 5})
> 2040536238
> user=> (.hashCode {:baz 5 :foo "Bar"})
> 2040536238
> 
> 
> (It even gave the same hash code in different runs.)
> 
> I will use these hashes to build index tables. My data, which
> contains the things I hash, is a set. I will store this as an
> ordered set and keep an index pointing to where records from this
> hash to that hash live. This is all Clojure, but I can't keep all
> my data in memory. (So Clojure's persistent data structures are out
> of the picture. Life would've been much simpler if I could.)
> 
> Thanks for reading. Any insight is appreciated.
> 


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJVyJ3vAAoJEKel+aujRZMkbhMIAJ61DGUWM9JoN/JcIxvh2Jph
VohlWbr1yw69D+x4guGOk5AXUh7HMAkmlbuc+YRRnYqGhZtc3r/6C/d/aa5faBAh
NdIeDa8yNHTAuYERDktfviy+q5a/blJRdvIIe7ntyjpDZyd2gD1AwUGYOKctXipS
wMPan7v7yPfPlFfnl+VVXfP8yx/LWyZbwfu0Ugv2B2NhvqPMu8joyondOz7GPcLd
P7EgpIrvfQAElA4c4+UB0BEeJkn+fnpYF3QLJIy5oQny5QwbVtxgVuUNES8EolYl
HkpFY1ECV/M65fvP6wrcYPihuphSYQoPkfY4ZQfzWCq9mo+3Aj1Jq2u7QfG9HxM=
=1UE6
-----END PGP SIGNATURE-----

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en