Has anyone created a design page on dev.clojure for this yet?
On Thursday, October 31, 2013 9:19:23 AM UTC-5, Andy Fingerhut wrote: > > Just a follow up after a fair amount of thinking and experimentation has > been done on this problem. > > Mark Engelberg has a very nice document describing the existing hash > functions used by Clojure, examples with why they can cause collisions so > often, and suggestions for changes to them to have fewer collisions in > these cases. > https://docs.google.com/document/d/10olJzjNHXn1Si1LsSvjNqF_X1O7OTPZLuv3Uo_duY5o/edit > > I have made modifications to a fork of the Clojure code base with his > suggested hash modifications here, for experimentation purposes: > https://github.com/jafingerhut/clojure I also have some modifications > to Paul's N-queens Clojure program to print out additional statistics and > try out different hash functions here: > https://github.com/jafingerhut/chess-clojure > > Running Paul's N-queens problem with a 6x6 board without any changes to > Clojure takes about 7 mins on one of my computers. With those changes to > the hash functions it finishes in about 12 sec. > > I haven't let a 6x9 board run to completion without those hash changes, > but it takes a long time :-) With those changes to the hash functions it > finishes in about 11 minutes. > > The reason for the currently slow behavior is a combination of the current > hash functions and the implementation of the PersistentHashSet data > structure used to implement sets. A PersistentHashSet is, very roughly, a > 32-way branching tree with the hash function of the elements used to decide > which branch to take. When you hit a leaf in the tree, all elements with > the same hash function value are put into an array that is scanned linearly > whenever you want to check whether a value is already in the set. So the > more hash collisions, the more it is like using an array to store and look > up set elements in O(n) time. > > With the 6x9 board, there are 20,136,752 solutions. The way those > solutions are represented in the program as sets of vectors, those 20 > million values only have 17,936 different hash values. Thus the average > length of those arrays of values in the tree leaves is 1,122 values. The > longest of those arrays has 81,610 values in it, all with the same hash > function. > > With Mark's proposed hash function changes, those 20,136,752 values hash > to 20,089,488 different ints, for an average array length of 1 value, and a > longest array length of 3 values. Big speedup. > > There is nothing unique about Mark's particular hash functions that > achieves this. Other hash modifications can give similar improvements. > The Clojure dev team is considering which changes would be good ones to > make, in hopes of changing them once and not having to do it again. > > Andy > > > On Tue, Oct 22, 2013 at 9:28 AM, Paul Butcher > <pa...@paulbutcher.com<javascript:> > > wrote: > >> I've been playing around with a generalised version of the N-Queens >> problem that handles other chess pieces. I have a Scala implementation that >> uses an eager depth-first search. Given enough RAM (around 12G - it's very >> memory hungry!) it can solve a 6x9 board with 6 pieces in around 2.5 >> minutes on my MacBook Pro. >> >> I then created a Clojure version of the same algorithm with the intention >> of (eventually) using laziness to avoid the memory issue. However, the >> performance of the Clojure version is several orders of magnitude worse >> than the Scala version (I've left it running overnight on the problem that >> the Scala version solves in 2.5 minutes and it still hasn't completed). >> >> I'm struggling to see why the performance of the Clojure version is so >> much worse than the Scala. Profiling in YourKit suggests that it's spending >> 99% of its time in PersistentHashSet.cons, but I'm at a loss to explain >> why. I would be grateful for any insight. >> >> The code for the two different versions is on GitHub: >> >> https://github.com/paulbutcher/chess-scala >> https://github.com/paulbutcher/chess-clojure >> >> Notes: >> >> - I know that an exhaustive depth-first search isn't a great way to >> tackle this problem. But I'd like to understand why I see such a dramatic >> difference between the performance of the Scala and Clojure versions. >> >> - I believe that I've implemented the same algorithm in both Scala and >> Clojure - certainly they both generate the same results for small problems >> (there are 3x3 and 4x4 problems commented out in the source). But clearly I >> can't rule out the possibility that I've made a mistake in the Clojure >> implementation. If anyone can spot my stupid mistake, I'd greatly >> appreciate it. >> >> Thanks in advance for any insight that anyone can offer. >> >> -- >> paul.butcher->msgCount++ >> >> Snetterton, Castle Combe, Cadwell Park... >> Who says I have a one track mind? >> >> http://www.paulbutcher.com/ >> LinkedIn: http://www.linkedin.com/in/paulbutcher >> MSN: pa...@paulbutcher.com <javascript:> >> AIM: paulrabutcher >> Skype: paulrabutcher >> >> -- >> -- >> You received this message because you are subscribed to the Google >> Groups "Clojure" group. >> To post to this group, send email to clo...@googlegroups.com<javascript:> >> Note that posts from new members are moderated - please be patient with >> your first post. >> To unsubscribe from this group, send email to >> clojure+u...@googlegroups.com <javascript:> >> For more options, visit this group at >> http://groups.google.com/group/clojure?hl=en >> --- >> You received this message because you are subscribed to the Google Groups >> "Clojure" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to clojure+u...@googlegroups.com <javascript:>. >> For more options, visit https://groups.google.com/groups/opt_out. >> > > -- -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.