Hi All, I am trying to set up monitoring to better understand the performance bottlenecks of my Spark application. I have some questions:
1. BlockManager.disk.diskSpaceUsed_MB is always zero when I go to http://localhost:4040/metrics/json/, even though I know the block manager is using a lot of disk space. For example, the Linux command

```
du -msc /tmp/blockmgr-bee83574-d958-4ef0-aaa7-a45f5012bdff  # my current block manager directory
```

returns a large non-zero number.

2. I am really curious how much memory my cached RDDs are taking. Are there any metrics related to that? I can't see any in http://localhost:4040/metrics/json/, even though the web UI does show some related numbers, e.g. under the Storage tab.

I am running Spark in local mode; maybe that is the problem? Or am I looking at the wrong metrics endpoint?

Thanks,
Gabor

p.s.: Some more details. This is my metrics.properties file:

```
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=localhost
*.sink.graphite.port=9109
*.sink.graphite.period=1
*.sink.graphite.unit=seconds
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
```

I want to use Graphite + Prometheus + Grafana to view my metrics in the end, but that's not important for now. If I could get what I want in the JSON dumps, I would already be happy!

And this is how I am starting my app:

```
~/spark-1.6.0/bin/spark-submit \
  --conf spark.metrics.conf=metrics.properties \
  --class "SimpleApp" \
  --master "local[4]" \
  --driver-memory 16g \
  target/scala-2.10/simple-project_2.10-1.0.jar
```
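p.p.s.: In case it helps to see what I mean by "the JSON dumps": this is a rough sketch of how I'm pulling the BlockManager gauges out of the endpoint's response. The app-id prefix and the values in the sample below are just illustrative, not taken from a real run; the zero disk value is the behavior I'm asking about.

```python
import json

# Hypothetical excerpt of the JSON returned by http://localhost:4040/metrics/json/
# (gauge names follow the <app-id>.driver.BlockManager.* pattern; values made up).
sample = json.loads("""
{
  "gauges": {
    "local-1453912324563.driver.BlockManager.disk.diskSpaceUsed_MB": {"value": 0},
    "local-1453912324563.driver.BlockManager.memory.memUsed_MB": {"value": 12}
  }
}
""")

def block_manager_gauges(metrics):
    """Return {metric-name: value} for every BlockManager gauge in the dump."""
    return {name: gauge["value"]
            for name, gauge in metrics.get("gauges", {}).items()
            if "BlockManager" in name}

for name, value in sorted(block_manager_gauges(sample).items()):
    print(name, "=", value)
```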
