I think the difference lies somewhere in here:
- RDD writes are done with SparkHadoopMapReduceWriter.executeTask, which
calls outputMetrics.setRecordsWritten (see the harness sketch after this list)
- DF writes are done with InsertIntoHadoopFsRelationCommand.run, and I'm
not entirely sure how that one works.
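
For anyone else poking at this, here's the kind of minimal harness I'd use
to see the difference from the outside. This is just a sketch, assuming
Spark 2.x in local mode; the object name and the /tmp paths are mine (and
assumed not to exist yet), and the sleeps are a crude stand-in for properly
draining the async listener bus:

    import java.util.concurrent.atomic.AtomicLong
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
    import org.apache.spark.sql.SparkSession

    object WriteMetricsComparison {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[2]").appName("write-metrics").getOrCreate()
        val sc = spark.sparkContext

        // Sum outputMetrics.recordsWritten across all completed tasks.
        val recordsWritten = new AtomicLong(0)
        sc.addSparkListener(new SparkListener {
          override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
            val m = taskEnd.taskMetrics
            if (m != null) recordsWritten.addAndGet(m.outputMetrics.recordsWritten)
          }
        })

        // RDD path (new Hadoop API): should go through
        // SparkHadoopMapReduceWriter.executeTask on the workers.
        sc.parallelize(1 to 1000).map(i => (i, i))
          .saveAsNewAPIHadoopFile[TextOutputFormat[Int, Int]]("/tmp/rdd-out")
        Thread.sleep(2000) // crude: let the async listener bus drain
        println(s"RDD write: recordsWritten = ${recordsWritten.get}")

        recordsWritten.set(0)
        // DF path: plans an InsertIntoHadoopFsRelationCommand.
        import spark.implicits._
        (1 to 1000).toDF("x").write.mode("overwrite").parquet("/tmp/df-out")
        Thread.sleep(2000)
        println(s"DF write: recordsWritten = ${recordsWritten.get}")

        spark.stop()
      }
    }

If the two paths really do differ in how they set outputMetrics, the two
printed counts should come out different.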

executeTask appears to run on the worker, writing a single RDD partition
out in a single Spark task; I can grok how that works. I'm not entirely
sure where the rubber hits the road for InsertIntoHadoopFsRelationCommand,
though.
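
From what I can tell from the 2.x source (my reading, so take it with
salt): run doesn't write anything on the driver itself. It delegates to
FileFormatWriter.write, which submits an ordinary Spark job, and each task
of that job writes one partition's files on an executor, roughly analogous
to executeTask on the RDD path. So the write loop to compare against
SparkHadoopMapReduceWriter.executeTask's metrics handling would live in
FileFormatWriter. A quick way to at least confirm that a DF write plans
through InsertIntoHadoopFsRelationCommand:

    // Paste into spark-shell (2.2-ish); prints which command each
    // successful DataFrameWriter action was planned as.
    import org.apache.spark.sql.execution.QueryExecution
    import org.apache.spark.sql.util.QueryExecutionListener

    spark.listenerManager.register(new QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
        println(s"$funcName => ${qe.logical.getClass.getSimpleName}")
      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
    })

    spark.range(100).toDF("id").write.mode("overwrite").parquet("/tmp/plan-peek")
    // I'd expect something like "save => InsertIntoHadoopFsRelationCommand"
    // (depending on version it may show up wrapped in another command).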


