I’m running into something very strange today. I’m getting an error on the following innocuous operations:
    a = sc.textFile('s3n://...')
    a = a.repartition(8)
    a = a.map(...)
    c = a.countByKey()  # ERRORs out on this action. See below for traceback. [1]

If I add a count() right after the repartition(), this error magically goes away.

    a = sc.textFile('s3n://...')
    a = a.repartition(8)
    print a.count()
    a = a.map(...)
    c = a.countByKey()  # A-OK! WTF?

To top it off, this “fix” is inconsistent. Sometimes, I still get this error.

This is strange. How do I get to the bottom of this?

Nick

[1] Here’s the traceback:

    Traceback (most recent call last):
      File "<stdin>", line 7, in <module>
      File "file.py", line 187, in function_blah
        c = a.countByKey()
      File "/root/spark/python/pyspark/rdd.py", line 778, in countByKey
        return self.map(lambda x: x[0]).countByValue()
      File "/root/spark/python/pyspark/rdd.py", line 624, in countByValue
        return self.mapPartitions(countPartition).reduce(mergeMaps)
      File "/root/spark/python/pyspark/rdd.py", line 505, in reduce
        vals = self.mapPartitions(func).collect()
      File "/root/spark/python/pyspark/rdd.py", line 469, in collect
        bytesInJava = self._jrdd.collect().iterator()
      File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
      File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 300, in get_return_value
    py4j.protocol.Py4JJavaError: An error occurred while calling o46.collect.
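
For reference, here is how I read the call chain in the traceback: countByKey() keeps only the key of each pair, counts values per partition, merges the per-partition counts, and only then collects. The plain-Python sketch below mimics that shape without Spark. The function names and the list-of-lists stand-in for partitions are my own, hypothetical, and not the pyspark internals. If this reading is right, the collect() inside countByKey() is the first point where the textFile/repartition/map pipeline actually runs, so the underlying failure could be in any of those earlier steps.

```python
from collections import defaultdict
from functools import reduce


def count_partition(keys):
    # Count occurrences of each key within a single partition.
    counts = defaultdict(int)
    for key in keys:
        counts[key] += 1
    return counts


def merge_maps(left, right):
    # Fold one partition's counts into the running total.
    for key, n in right.items():
        left[key] += n
    return left


def count_by_key_local(partitions):
    # `partitions` is a list of lists of (key, value) pairs, a stand-in
    # for an RDD's partitions (an assumption made for illustration only).
    partials = [count_partition([pair[0] for pair in part]) for part in partitions]
    return dict(reduce(merge_maps, partials, defaultdict(int)))
```

Calling count_by_key_local([[("a", 1), ("b", 2)], [("a", 3)]]) counts "a" twice and "b" once across the two stand-in partitions.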