On Mon, Mar 22, 2010 at 5:39 AM, Michał Marczyk
<michal.marc...@gmail.com> wrote:
>
> Not sure if I want to draw any particular conclusion from this...
> Probably not, since I'm not yet done wondering to which degree I might
> be correct in thinking it. :-) I do want to stress once again the need
> for benchmarking with expensive tasks, though. Addition is so cheap
> that it's guaranteed not to be worth the bookkeeping cost of fancy
> work splitting solutions.

Sure. I'm testing the code with very light tasks purely for evaluating
the internal overhead of the algorithm. That's not what I'd use in a
real application.

Anyway, I wrote some vector decomposition code in Java that seems to
yield good results. Please find the source files attached. I'm not
very familiar with Clojure implementation so the code is pretty ugly,
but it works for my limited purposes.

The code provides a "partition" method for vectors, which decomposes
the vector into chunks exposing the internal structure of the
PersistentVector. Data are not copied or moved around (except for
allocating new wrapping subvectors) so the code is reasonably fast. It
does not, however, attempt to balance the tree. Currently the
"partition" method decomposes the vector down to single chunks,
perhaps it makes sense to let the user limit the depth of this process
to control the size and number of chunks.

I've tested the code with:

(defn sum_seq2 [vec]
  (reduce + vec))

(defn- sum_tree_int5 [vec]
  (let [s (seq vec)]
    (if (vector? (first s))
      (reduce + (map sum_tree_int5 vec))
      (.reduce (chunk-first s) + 0))))

(defn sum_tree5 [vec]
  (let [v (. vec (partition))]
    (sum_tree_int5 v)))

(def l3 (vec (range 1 1000000)))

user=> (dotimes [_ 5] (time (sum_seq2 l3)))
"Elapsed time: 72.643896 msecs"
"Elapsed time: 77.785335 msecs"
"Elapsed time: 71.331996 msecs"
"Elapsed time: 70.790584 msecs"
"Elapsed time: 73.085287 msecs"
nil
user=> (dotimes [_ 5] (time (sum_tree5 l3)))
"Elapsed time: 88.355671 msecs"
"Elapsed time: 96.527659 msecs"
"Elapsed time: 96.205825 msecs"
"Elapsed time: 95.339803 msecs"
"Elapsed time: 95.249012 msecs"
nil

If you replace "map" with "pmap" the results get worse (tested on a
2-core cpu) which is not that surprising, taking into account the size
of chunks and tasks:

user=> (dotimes [_ 5] (time (sum_tree5 l3)))
"Elapsed time: 962.212255 msecs"
"Elapsed time: 956.484688 msecs"
"Elapsed time: 967.799812 msecs"
"Elapsed time: 947.298315 msecs"
"Elapsed time: 973.367556 msecs"
nil

Cheers,
Andrzej

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

To unsubscribe from this group, send email to 
clojure+unsubscribegooglegroups.com or reply to this email with the words 
"REMOVE ME" as the subject.
diff -Nru clojure-1.1.0-orig/src/jvm/clojure/lang/PersistentVector.java clojure-1.1.0/src/jvm/clojure/lang/PersistentVector.java
--- clojure-1.1.0-orig/src/jvm/clojure/lang/PersistentVector.java	2009-12-31 09:11:38.000000000 +0900
+++ clojure-1.1.0/src/jvm/clojure/lang/PersistentVector.java	2010-03-22 16:43:47.000000000 +0900
@@ -18,8 +18,8 @@
 public class PersistentVector extends APersistentVector implements IEditableCollection{
 
 static class Node{
-	final AtomicReference<Thread> edit;
-	final Object[] array;
+	public final AtomicReference<Thread> edit;
+	public final Object[] array;
 
 	Node(AtomicReference<Thread> edit, Object[] array){
 		this.edit = edit;
@@ -35,10 +35,10 @@
 final static AtomicReference<Thread> NOEDIT = new AtomicReference<Thread>(null);
 final static Node EMPTY_NODE = new Node(NOEDIT, new Object[32]);
 
-final int cnt;
-final int shift;
-final Node root;
-final Object[] tail;
+public final int cnt;
+public final int shift;
+public final Node root;
+public final Object[] tail;
 
 public final static PersistentVector EMPTY = new PersistentVector(0, 5, EMPTY_NODE, new Object[]{});
 
@@ -84,7 +84,7 @@
 	return new TransientVector(this);
 }
 
-final int tailoff(){
+public final int tailoff(){
 	if(cnt < 32)
 		return 0;
 	return ((cnt - 1) >>> 5) << 5;
@@ -209,6 +209,19 @@
 	return ret;
 }
 
+    public PersistentVector partition(){
+	PersistentVector tailVec = new PersistentVector(tail.length, 5, EMPTY_NODE, tail);
+	int length = tailoff();
+	if (length == 0)
+	    return tailVec;
+	int rootLength = length;
+	rootLength = ((rootLength - 1) >>> shift) + 1;
+	while (rootLength > 32)
+	    rootLength = ((rootLength - 1) >>> 5) + 1;
+	VectorChunk rootVec = new VectorChunk(length, rootLength, shift, root.array);
+	return PersistentVector.create(rootVec, tailVec);
+    }
+
 public IChunkedSeq chunkedSeq(){
 	if(count() == 0)
 		return null;
diff -Nru clojure-1.1.0-orig/src/jvm/clojure/lang/VectorChunk.java clojure-1.1.0/src/jvm/clojure/lang/VectorChunk.java
--- clojure-1.1.0-orig/src/jvm/clojure/lang/VectorChunk.java	1970-01-01 09:00:00.000000000 +0900
+++ clojure-1.1.0/src/jvm/clojure/lang/VectorChunk.java	2010-03-22 16:50:24.000000000 +0900
@@ -0,0 +1,49 @@
+package clojure.lang;
+
+public class VectorChunk extends PersistentVector {
+    public final Object[] vec;
+    public final int cnt_tot;
+    public final int cnt;
+    public final int shift;
+    
+    VectorChunk(int cnt_tot, int cnt, int shift, Object[] vec) {
+	super(cnt, shift, EMPTY_NODE, vec);
+	this.vec = vec;
+	this.cnt_tot = cnt_tot;
+	this.cnt = cnt;
+	this.shift = shift;
+    }
+    
+    public Object nth(int i){
+	if(i >= 0 && i < cnt)
+	    {
+		Object result;
+		Node node = (Node) vec[i];
+		int length, rootLength;
+		int sliceLength = 32 << (shift - 5);
+		if (i == cnt - 1) {
+		    length = cnt_tot - (cnt - 1) * sliceLength;
+		    rootLength = length;
+		    while (rootLength > 32)
+			rootLength = ((rootLength - 1) >>> 5) + 1;
+		} else {
+		    length = sliceLength;
+		    rootLength = 32;
+		}
+		if (shift > 5) {
+		    result = new VectorChunk(length, rootLength, shift-5, node.array);
+		} else
+		    result = new PersistentVector(node.array.length, shift, EMPTY_NODE, node.array);
+		return result;
+		}
+	throw new IndexOutOfBoundsException();
+    }
+    
+    public ISeq seq(){
+	if(count() > 0)
+	    return new Seq(this, 0);
+	return null;
+    }
+
+}
+

Reply via email to