OutputMetrics empty for DF writes - any hints?

2017-11-27 Thread Jason White
I'd like to use the SparkListenerInterface to listen for some metrics for monitoring/logging/metadata purposes. The first ones I'm interested in hooking into are recordsWritten and bytesWritten as a measure of throughput. I'm using PySpark to write Parquet files from DataFrames. I'm able to extract...
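The listener hook itself is straightforward; here is a minimal sketch (class name and log format are illustrative, and taskMetrics can be null for failed tasks):

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

    // Minimal sketch: log per-task write throughput from output metrics.
    class WriteThroughputListener extends SparkListener {
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        // taskMetrics may be null if the task failed
        Option(taskEnd.taskMetrics).map(_.outputMetrics).foreach { om =>
          if (om.recordsWritten > 0 || om.bytesWritten > 0) {
            println(s"stage ${taskEnd.stageId}: " +
              s"records=${om.recordsWritten} bytes=${om.bytesWritten}")
          }
        }
      }
    }

From PySpark, the usual way to attach such a listener is to package it in a jar and set spark.extraListeners to its fully qualified class name.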

Re: OutputMetrics empty for DF writes - any hints?

2017-11-27 Thread Reynold Xin
Is this due to the insert command not having metrics? It's a problem we should fix.
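For context, operators that do report SQL metrics declare them via SQLMetrics. A rough sketch of that declaration pattern (the names here are illustrative, not the actual InsertIntoHadoopFsRelationCommand code):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

    // Sketch of the declaration pattern used by metric-reporting operators.
    // Map keys and display names are illustrative.
    def makeWriteMetrics(sc: SparkContext): Map[String, SQLMetric] = Map(
      "numOutputRows"  -> SQLMetrics.createMetric(sc, "number of output rows"),
      "numOutputBytes" -> SQLMetrics.createSizeMetric(sc, "written output size")
    )

An operator carrying a map like this would surface its metrics in the SQL UI; the point above is that the insert command apparently declares none.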

Re: OutputMetrics empty for DF writes - any hints?

2017-11-27 Thread Jason White
I think the difference lies somewhere in here:
- RDD writes are done with SparkHadoopMapReduceWriter.executeTask, which calls outputMetrics.setRecordsWritten.
- DF writes are done with InsertIntoHadoopFsRelationCommand.run, which I'm not entirely sure how it works.
executeTask appears to be run on...
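A quick way to see the divergence is to drive both write paths with the same data and compare what a listener reports (a spark-shell sketch; output paths are illustrative):

    val df = spark.range(1000).toDF("id")

    // RDD path: goes through SparkHadoopMapReduceWriter.executeTask,
    // which calls outputMetrics.setRecordsWritten, so the listener
    // should see non-zero recordsWritten here.
    df.rdd.map(_.toString).saveAsTextFile("/tmp/out-rdd")

    // DF path: goes through InsertIntoHadoopFsRelationCommand.run,
    // where outputMetrics were reportedly left empty at the time.
    df.write.mode("overwrite").parquet("/tmp/out-df")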

Re: OutputMetrics empty for DF writes - any hints?

2017-11-27 Thread Jason White
It doesn't look like the insert command has any metrics in it. I don't see any commands with metrics, but I could be missing something.
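One way to confirm this empirically is a QueryExecutionListener that dumps whatever SQLMetrics the executed plan exposes after a write completes (a sketch assuming Spark 2.x APIs; class name is illustrative):

    import org.apache.spark.sql.execution.QueryExecution
    import org.apache.spark.sql.util.QueryExecutionListener

    // Sketch: print every SQLMetric the executed plan of a completed
    // action exposes. An empty dump after a DF write supports the
    // observation that the insert command declares no metrics.
    class MetricsDumpListener extends QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution,
                             durationNs: Long): Unit = {
        qe.executedPlan.metrics.foreach { case (name, metric) =>
          println(s"$funcName: $name = ${metric.value}")
        }
      }
      override def onFailure(funcName: String, qe: QueryExecution,
                             exception: Exception): Unit = ()
    }

    // Register it: spark.listenerManager.register(new MetricsDumpListener)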

Does anyone know how to build Spark with Scala 2.12.4?

2017-11-27 Thread Zhang, Liyun
Hi all: Does anyone know how to build Spark with Scala 2.12.4? I want to test whether Spark can work on JDK 9 or not; Scala 2.12.4 supports JDK 9. Has anyone tried to build Spark with Scala 2.12.4, or compiled it successfully with JDK 9? I'd appreciate any feedback from you. Best Regards, Kelly Zhang/Zhang, Liyun
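For what it's worth, the experimental Scala 2.12 build in the Spark source tree at the time went roughly as below; the profile and script names are as I recall them from the Spark 2.2/2.3 era, so treat them as an assumption, and note that JDK 9 was never an officially supported Spark target (JDK 11 support arrived later, in Spark 3.0):

    # Switch the POMs to the Scala 2.12 line, then build with the
    # (experimental at the time) scala-2.12 profile:
    ./dev/change-scala-version.sh 2.12
    ./build/mvn -Pscala-2.12 -DskipTests clean package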