Has anyone created a design page on dev.clojure for this yet? 

On Thursday, October 31, 2013 9:19:23 AM UTC-5, Andy Fingerhut wrote:
>
> Just a follow up after a fair amount of thinking and experimentation has 
> been done on this problem.
>
> Mark Engelberg has a very nice document describing the existing hash 
> functions used by Clojure, examples with why they can cause collisions so 
> often, and suggestions for changes to them to have fewer collisions in 
> these cases.  
> https://docs.google.com/document/d/10olJzjNHXn1Si1LsSvjNqF_X1O7OTPZLuv3Uo_duY5o/edit
>
> I have made modifications to a fork of the Clojure code base with his 
> suggested hash modifications here, for experimentation purposes: 
> https://github.com/jafingerhut/clojure   I also have some modifications 
> to Paul's N-queens Clojure program to print out additional statistics and 
> try out different hash functions here: 
> https://github.com/jafingerhut/chess-clojure
>
> Running Paul's N-queens problem with a 6x6 board without any changes to 
> Clojure takes about 7 mins on one of my computers.  With those changes to 
> the hash functions it finishes in about 12 sec.
>
> I haven't let a 6x9 board run to completion without those hash changes, 
> but it takes a long time :-)  With those changes to the hash functions it 
> finishes in about 11 minutes.
>
> The reason for the currently slow behavior is a combination of the current 
> hash functions and the implementation of the PersistentHashSet data 
> structure used to implement sets.  A PersistentHashSet is, very roughly, a 
> 32-way branching tree with the hash function of the elements used to decide 
> which branch to take.  When you hit a leaf in the tree, all elements with 
> the same hash function value are put into an array that is scanned linearly 
> whenever you want to check whether a value is already in the set.  So the 
> more hash collisions, the more it is like using an array to store and look 
> up set elements in O(n) time.
>
> With the 6x9 board, there are 20,136,752 solutions.  The way those 
> solutions are represented in the program as sets of vectors, those 20 
> million values only have 17,936 different hash values.  Thus the average 
> length of those arrays of values in the tree leaves is 1,122 values.  The 
> longest of those arrays has 81,610 values in it, all with the same hash 
> function.
>
> With Mark's proposed hash function changes, those 20,136,752 values hash 
> to 20,089,488 different ints, for an average array length of 1 value, and a 
> longest array length of 3 values.  Big speedup.
>
> There is nothing unique about Mark's particular hash functions that 
> achieves this.  Other hash modifications can give similar improvements.  
> The Clojure dev team is considering which changes would be good ones to 
> make, in hopes of changing them once and not having to do it again.
>
> Andy
>
>
> On Tue, Oct 22, 2013 at 9:28 AM, Paul Butcher 
> <pa...@paulbutcher.com<javascript:>
> > wrote:
>
>> I've been playing around with a generalised version of the N-Queens 
>> problem that handles other chess pieces. I have a Scala implementation that 
>> uses an eager depth-first search. Given enough RAM (around 12G - it's very 
>> memory hungry!) it can solve a 6x9 board with 6 pieces in around 2.5 
>> minutes on my MacBook Pro.
>>
>> I then created a Clojure version of the same algorithm with the intention 
>> of (eventually) using laziness to avoid the memory issue. However, the 
>> performance of the Clojure version is several orders of magnitude worse 
>> than the Scala version (I've left it running overnight on the problem that 
>> the Scala version solves in 2.5 minutes and it still hasn't completed).
>>
>> I'm struggling to see why the performance of the Clojure version is so 
>> much worse than the Scala. Profiling in YourKit suggests that it's spending 
>> 99% of its time in PersistentHashSet.cons, but I'm at a loss to explain 
>> why. I would be grateful for any insight.
>>
>> The code for the two different versions is on GitHub:
>>
>> https://github.com/paulbutcher/chess-scala
>> https://github.com/paulbutcher/chess-clojure
>>
>> Notes:
>>
>> - I know that an exhaustive depth-first search isn't a great way to 
>> tackle this problem. But I'd like to understand why I see such a dramatic 
>> difference between the performance of the Scala and Clojure versions.
>>
>> - I believe that I've implemented the same algorithm in both Scala and 
>> Clojure - certainly they both generate the same results for small problems 
>> (there are 3x3 and 4x4 problems commented out in the source). But clearly I 
>> can't rule out the possibility that I've made a mistake in the Clojure 
>> implementation. If anyone can spot my stupid mistake, I'd greatly 
>> appreciate it.
>>
>> Thanks in advance for any insight that anyone can offer.
>>
>> --
>> paul.butcher->msgCount++
>>
>> Snetterton, Castle Combe, Cadwell Park...
>> Who says I have a one track mind?
>>
>> http://www.paulbutcher.com/
>> LinkedIn: http://www.linkedin.com/in/paulbutcher
>> MSN: pa...@paulbutcher.com <javascript:>
>> AIM: paulrabutcher
>> Skype: paulrabutcher
>>  
>>  -- 
>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clo...@googlegroups.com<javascript:>
>> Note that posts from new members are moderated - please be patient with 
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+u...@googlegroups.com <javascript:>
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to clojure+u...@googlegroups.com <javascript:>.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to