Hi Attila,

I was configuring metrics.properties following the steps below:
1. Set the following in metrics.properties:

   *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
   master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
   worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
   driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
   executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

2. Restart the Spark master and workers.
3. Connect to the monitoring tool using <driverhost>:<driverport>, e.g. <host_machine>:4040.

But it gives an error (screenshot attached). Any clue what was missed?

Regards
Ranju

From: Attila Zsolt Piros <piros.attila.zs...@gmail.com>
Sent: Monday, March 22, 2021 11:07 AM
To: Ranju Jain <ranju.j...@ericsson.com>
Cc: Mich Talebzadeh <mich.talebza...@gmail.com>; user@spark.apache.org
Subject: Re: Can JVisual VM monitoring tool be used to Monitor Spark Executor Memory and CPU

Hi Ranju!

I am quite sure that for your requirement, "monitor every component and isolate the resources consumed individually by every component", Spark's metrics system is the right direction to go.

> Why only UsedstorageMemory should be checked?

Right, for you storage memory alone won't be enough; you need the system and the execution memory too. I expect "JVMHeapMemory" and "JVMOffHeapMemory" are what you are looking for.

> Also I noticed cpuTime provides cpu time spent by an executor. But there is
> no metric by which I can calculate the number of cores.

The number of cores is specified at Spark submit. IIRC, if you pass 3 it means each executor can run a maximum of 3 tasks at the same time, so all of those cores will be used if there are enough tasks.

I know this is not a perfect solution, but I hope it helps.

> Also I see Grafana, a very good visualization tool where I see all the
> metrics can be viewed, but I have less idea for steps to install on virtual
> server and integrate.

I cannot help with the specifics here, but a monitoring system is a good idea, either Grafana or Prometheus.
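One likely cause of the connection error reported at the top of this thread: port 4040 is Spark's web UI (HTTP) port, not a JMX endpoint, so JVisualVM cannot attach to it. A minimal sketch of how remote JMX is typically enabled on the driver JVM so a JMX console can connect (port 9999 is an arbitrary example, and authentication/SSL are disabled here only to keep the sketch short; do not do that in production):

```shell
# Sketch: enable remote JMX on the driver JVM so JVisualVM can attach.
# The port number and the application name are illustrative assumptions.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dcom.sun.management.jmxremote \
    -Dcom.sun.management.jmxremote.port=9999 \
    -Dcom.sun.management.jmxremote.authenticate=false \
    -Dcom.sun.management.jmxremote.ssl=false" \
  --files metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  my_app.py

# In JVisualVM: File > Add JMX Connection > <driverhost>:9999 (not 4040).
```

For executors, the same JVM options would go into spark.executor.extraJavaOptions, but note that a fixed port collides when several executors share a host, so a dynamic port (port=0) or one JMX port per host is usually needed.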
Best regards,
Attila

On Sun, Mar 21, 2021 at 3:01 PM Ranju Jain <ranju.j...@ericsson.com> wrote:

Hi Mich/Attila,

@Mich Talebzadeh: I considered the Spark GUI, but I first have a confusion at the memory level.

App configuration: spark.executor.memory=4g for running the Spark job.

In the Spark GUI I see the running Spark job has a Peak Execution Memory of 1 KB, as highlighted in the attached screenshot. I do not have a Storage Memory screenshot, so I calculated the total memory consumption at that point in time as:

spark.executor.memory = Peak Execution Memory + Storage Mem + Reserved Mem + User Memory
                      = 1 KB + Storage Mem + 300 MB + (4g * 0.25)
                      = 1 KB + Storage Mem + 300 MB + 1 GB
                      = approx. 1.5 GB

And if I look at the actual memory consumption of Executors 0, 1 and 2 on the virtual server using the top command, it shows the readings in the attached screenshots (top output for Executor-2 and Executor-0).

Please suggest: on the Spark GUI, can I go with the formula below to isolate how much memory the Spark component consumes out of the several other components of a web application?

spark.executor.memory = Peak Execution Memory + Storage Mem + Reserved Mem + User Memory
                      = 1 KB + Storage Mem + 300 MB + (4g * 0.25)

@Attila Zsolt Piros: I checked the memoryMetrics.* of executor-metrics <https://spark.apache.org/docs/3.0.0-preview/monitoring.html#executor-metrics>, but here I have a confusion about:

  usedOnHeapStorageMemory
  usedOffHeapStorageMemory
  totalOnHeapStorageMemory
  totalOffHeapStorageMemory

Why should only used storage memory be checked? To isolate spark.executor.memory, should I check memoryMetrics.*, where only storage memory is given, or should I check peakMemoryMetrics.*, where all the peaks are specified:

1. Execution
2. Storage
3. JVM Heap

Also, I noticed cpuTime provides the CPU time spent by an executor, but there is no metric from which I can calculate the number of cores.

As suggested, I checked Luca Canali's presentation, where I see JmxSink, which registers metrics for viewing in a JMX console. I think exposing these metrics via JmxSink would make it possible to visualize spark.executor.memory and the number of cores per executor in a Java monitoring tool.

I also see Grafana, a very good visualization tool where all the metrics can be viewed, but I have little idea of the steps to install it on a virtual server and integrate it. I need to go through Grafana in detail.

Kindly suggest your views.

Regards
Ranju

From: Attila Zsolt Piros <piros.attila.zs...@gmail.com>
Sent: Sunday, March 21, 2021 3:42 AM
To: Mich Talebzadeh <mich.talebza...@gmail.com>
Cc: Ranju Jain <ranju.j...@ericsson.com.invalid>; user@spark.apache.org
Subject: Re: Can JVisual VM monitoring tool be used to Monitor Spark Executor Memory and CPU

Hi Ranju!

You can configure Spark's metrics system. Check the memoryMetrics.* of executor-metrics <https://spark.apache.org/docs/3.0.0-preview/monitoring.html#executor-metrics> and, in component-instance-executor <https://spark.apache.org/docs/3.0.0-preview/monitoring.html#component-instance--executor>, the CPU times.

Regarding the details, I suggest checking Luca Canali's presentations about Spark's metrics system and maybe his GitHub repo <https://github.com/LucaCanali/sparkMeasure>.

Best Regards,
Attila

On Sat, Mar 20, 2021 at 5:41 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi,

Have you considered the Spark GUI first?
view my LinkedIn profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Sat, 20 Mar 2021 at 16:06, Ranju Jain <ranju.j...@ericsson.com.invalid> wrote:

Hi All,

A virtual machine is running an application, and this application has various other 3PP components running, such as Spark, a database, etc. My requirement is to monitor every component and isolate the resources consumed individually by each component.

I am thinking of using a common tool such as Java VisualVM, where I specify the JMX URL of every component and monitor each of them. For the other components I am able to view their resources. Is there a possibility of viewing the Spark executor CPU/memory via the Java VisualVM tool?

Please guide.

Regards
Ranju
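As a supplement to the executor-memory formula discussed earlier in this thread, here is a sketch of the arithmetic per Spark's documented unified memory model (300 MB reserved, spark.memory.fraction defaulting to 0.6 and spark.memory.storageFraction to 0.5, per the Spark tuning guide). Note that the user-memory share comes out as 0.4 of the usable heap rather than the flat 0.25 factor used earlier in the thread. The helper function below is purely illustrative, not a Spark API:

```python
# Sketch of Spark's on-heap unified memory model, per the tuning guide.
# All names here are illustrative; only the fractions come from Spark docs.

RESERVED_MB = 300  # fixed reserved memory in every executor heap


def memory_regions(executor_memory_mb: int,
                   memory_fraction: float = 0.6,
                   storage_fraction: float = 0.5) -> dict:
    """Split spark.executor.memory into the documented regions (in MB)."""
    usable = executor_memory_mb - RESERVED_MB
    unified = usable * memory_fraction        # shared execution + storage
    storage = unified * storage_fraction      # storage share (evictable)
    execution = unified - storage             # execution share
    user = usable * (1.0 - memory_fraction)   # user data structures
    return {
        "reserved_mb": RESERVED_MB,
        "execution_mb": execution,
        "storage_mb": storage,
        "user_mb": user,
    }


# spark.executor.memory=4g -> (4096 - 300) * 0.6 = 2277.6 MB unified,
# split evenly between execution and storage by default.
regions = memory_regions(4096)
```

Execution and storage borrow from each other inside the unified region, so the "peak execution memory" seen in the UI can legitimately exceed its nominal half when storage is idle.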