Re: BUG: graph.triplets does not return proper values

2014-05-20 Thread GlennStrycker
For some reason it does not appear when I hit "tab" in Spark shell, but when I put everything together in one line, it DOES WORK! orig_graph.edges.map(_.copy()).cartesian(orig_graph.edges.map(_.copy())).flatMap( A => Seq(if (A._1.srcId == A._2.dstId) Edge(A._2.srcId,A._1.dstId,1) else if (A._1.dst

Re: BUG: graph.triplets does not return proper values

2014-05-20 Thread GlennStrycker
I don't seem to have this function in my Spark installation for this object, or the classes MappedRDD, FlatMappedRDD, EdgeRDD, VertexRDD, or Graph. Which class should have the reduceByKey function, and how do I cast my current RDD as this class? Perhaps this is still due to my Spark installation

Re: BUG: graph.triplets does not return proper values

2014-05-20 Thread GlennStrycker
Wait a minute... doesn't a reduce function return 1 element PER key pair? For example, word-count mapreduce functions return a {word, count} element for every unique word. Is this supposed to be a 1-element RDD object? The .reduce function for a MappedRDD or FlatMappedRDD both are of the form

Re: BUG: graph.triplets does not return proper values

2014-05-20 Thread GlennStrycker
Oh... ha, good point. Sorry, I'm new to mapreduce programming and forgot about that... I'll have to adjust my reduce function to output a vector/RDD as the element to return. Thanks for reminding me of this! -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabbl

Re: BUG: graph.triplets does not return proper values

2014-05-19 Thread GlennStrycker
I tried adding .copy() everywhere, but still only get one element returned, not even an RDD object. orig_graph.edges.map(_.copy()).flatMap(edge => Seq(edge) ).map(edge => (Edge(edge.copy().srcId, edge.copy().dstId, edge.copy().attr), 1)).reduce( (A,B) => { if (A._1.copy().dstId == B._1.copy().srcI

Re: BUG: graph.triplets does not return proper values

2014-05-19 Thread GlennStrycker
Thanks, rxin, this worked! I am having a similar problem with .reduce... do I need to insert .copy() functions in that statement as well? This part works: orig_graph.edges.map(_.copy()).flatMap(edge => Seq(edge) ).map(edge => (Edge(edge.copy().srcId, edge.copy().dstId, edge.copy().attr), 1)).coll

BUG: graph.triplets does not return proper values

2014-05-19 Thread GlennStrycker
graph.triplets does not work -- it returns incorrect results I have a graph with the following edges: orig_graph.edges.collect = Array(Edge(1,4,1), Edge(1,5,1), Edge(1,7,1), Edge(2,5,1), Edge(2,6,1), Edge(3,5,1), Edge(3,6,1), Edge(3,7,1), Edge(4,1,1), Edge(5,1,1), Edge(5,2,1), Edge(5,3,1), Edge(

Re: Scala examples for Spark do not work as written in documentation

2014-05-16 Thread GlennStrycker
Why does the reduce function only work on sums of keys of the same type and does not support other functional forms? I am having trouble in another example where instead of 1s and 0s, the output of the map function is something like A=(1,2) and B=(3,4). I need a reduce function that can return so

reduce only removes duplicates, cannot be arbitrary function

2014-05-16 Thread GlennStrycker
I am attempting to write a mapreduce job on a graph object to take an edge list and return a new edge list. Unfortunately I find that the current function is def reduce(f: (T, T) => T): T not def reduce(f: (T1, T2) => T3): T I see this because the following 2 commands give different results f

Scala examples for Spark do not work as written in documentation

2014-05-16 Thread GlennStrycker
On the webpage http://spark.apache.org/examples.html, there is an example written as val count = spark.parallelize(1 to NUM_SAMPLES).map(i => val x = Math.random() val y = Math.random() if (x*x + y*y < 1) 1 else 0 ).reduce(_ + _) println("Pi is roughly " + 4.0 * count / NUM_SAMPLES) This do