I'd like to use the SparkListenerInterface to listen for some metrics for
monitoring/logging/metadata purposes. The first ones I'm interested in
hooking into are recordsWritten and bytesWritten as a measure of throughput.
I'm using PySpark to write Parquet files from DataFrames.
I'm able to extract ...
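(For reference, a minimal sketch of the kind of listener this would need, assuming the Spark 2.x SparkListener API; the class name WriteMetricsListener is made up. From PySpark it would have to be registered on the JVM side, e.g. via the spark.extraListeners configuration with the class on the driver classpath.)

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

    // Sketch: log per-task output metrics as tasks finish.
    class WriteMetricsListener extends SparkListener {
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        val metrics = taskEnd.taskMetrics   // may be null for failed tasks
        if (metrics != null) {
          val out = metrics.outputMetrics
          if (out.recordsWritten > 0 || out.bytesWritten > 0) {
            println(s"stage=${taskEnd.stageId} recordsWritten=${out.recordsWritten} " +
              s"bytesWritten=${out.bytesWritten}")
          }
        }
      }
    }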
Is this due to the insert command not having metrics? It's a problem we
should fix.
On Mon, Nov 27, 2017 at 10:45 AM, Jason White wrote:
> I'd like to use the SparkListenerInterface to listen for some metrics for
> monitoring/logging/metadata purposes. The first ones I'm interested in
> hooking into ...
I think the difference lies somewhere in here:
- RDD writes are done with SparkHadoopMapReduceWriter.executeTask, which calls outputMetrics.setRecordsWritten
- DF writes appear to be done with InsertIntoHadoopFsRelationCommand.run, though I'm not entirely sure how that works.
executeTask appears to be run on ...
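(For comparison, a hedged repro sketch for spark-shell, with made-up output paths: writing the same data once through the RDD API and once through the DataFrame writer should show a listener like the one above which path actually populates the output metrics.)

    // Same data written via both APIs, so a task-end listener can show
    // which path populates outputMetrics. Paths are illustrative.
    val ds = spark.range(0L, 1000000L)

    // RDD path: goes through SparkHadoopMapReduceWriter.executeTask,
    // which calls outputMetrics.setRecordsWritten on the task.
    ds.rdd.map(_.toString).saveAsTextFile("/tmp/metrics-test/rdd_out")

    // DataFrame path: planned as InsertIntoHadoopFsRelationCommand; whether the
    // same task-level output metrics get filled in is exactly the open question.
    ds.write.mode("overwrite").parquet("/tmp/metrics-test/df_out")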
It doesn't look like the insert command has any metrics in it. I don't see
any commands with metrics, but I could be missing something.
Hi all:
Does anyone know how to build Spark with Scala 2.12.4? I want to test whether
Spark can work on JDK 9 or not; Scala 2.12.4 supports JDK 9. Has anyone tried to
build Spark with Scala 2.12.4, or compiled it successfully with JDK 9? I'd
appreciate any feedback.
Best Regards
Kelly Zhang/Zha
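(For reference, a hedged sketch of the usual build invocation, assuming a checkout of master where dev/change-scala-version.sh and a scala-2.12 Maven profile are already present; whether the resulting build actually runs on JDK 9 is a separate question.)

    # Switch the build to Scala 2.12 and build with the scala-2.12 profile.
    ./dev/change-scala-version.sh 2.12
    ./build/mvn -Pscala-2.12 -DskipTests clean package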