I am running a Spark job with only two operations: mapPartitions() followed by collect(). The output of mapPartitions() is very small, one integer per partition. I see a stage 2 for this job that runs a Java program, NativeMethodAccessorImpl.java. I am not a Java programmer. Could anyone tell me what this Java program does, or how to keep it from running, or at least how to make it run faster? The collect() call is not important to me. All the work is done in mapPartitions(), which sends data to a key-value store, so it is really something like foreachPartition(). However, I have not been able to get foreachPartition() to run. I am using Spark 1.1.1.
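For reference, a minimal sketch of the pattern described above. It assumes PySpark, which the post does not confirm, and every name in it is hypothetical: `KV_STORE` is an in-memory stand-in for the real k-v store, and `send_partition`, `send_partition_iter`, and `run_on_spark` are illustrative helpers, not the actual job.

```python
# Hypothetical stand-in for the external k-v store. A real job would call a
# client library here, and the writes would happen on the executors.
KV_STORE = {}


def send_partition(records):
    """Write every (key, value) pair of one partition; return how many were sent."""
    count = 0
    for key, value in records:
        KV_STORE[key] = value
        count += 1
    return count


def send_partition_iter(records):
    """Generator form: the partition function hands back one integer per partition."""
    yield send_partition(records)


def run_on_spark(sc, data, num_partitions=4, use_foreach=False):
    """Sketch only: needs a live SparkContext `sc` (assumed, not created here)."""
    rdd = sc.parallelize(data, num_partitions)
    if use_foreach:
        # foreachPartition is an action in its own right and returns None,
        # so the yielded integer is ignored and no collect() is needed.
        rdd.foreachPartition(send_partition_iter)
        return None
    # Pattern from the post: one small integer per partition comes back to the driver.
    return rdd.mapPartitions(send_partition_iter).collect()
```

The two helper functions can be tried without Spark by passing a plain iterator, for example `list(send_partition_iter(iter([("a", 1)])))`. Whichever branch is used, some action is still required to trigger the work, because mapPartitions() on its own is lazy.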
Thanks!

View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/What-does-NativeMethodAccessorImpl-java-do-tp13667.html