I am trying to read an edge list into a Graph. My data looks like this:

    394365859 --> 136153151
    589404147 --> 1361045425
I read it into a Graph via:

    import org.apache.spark.graphx.Graph
    import org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut
    import org.apache.spark.rdd.RDD

    // Each line is split on tabs; fields 0 and 2 are the source and destination IDs.
    val edgeFullStrRDD: RDD[String] = sc.textFile(unidirFName)
    val edgeTupRDD = edgeFullStrRDD.map(x => x.split("\t"))
      .map(x => (x(0).toLong, x(2).toLong))
    val g = Graph.fromEdgeTuples(edgeTupRDD, defaultValue = 123,
      uniqueEdges = Option(CanonicalRandomVertexCut))

Now, edgeTupRDD.distinct().count() tells me I have 240086 distinct lines in the file, and g.numEdges tells me they were combined into 240096 weighted edges (which is really weird, since that is more edges than there are distinct lines in the RDD), but g.edges.distinct().count() tells me I have only 10. Why?