Hi Attila,

I configured metrics.properties by following the steps below:


  1.  Added the following to metrics.properties:

      *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
      master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
      worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
      driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
      executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

  2.  Restarted the Spark master and workers.
  3.  Connected the monitoring tool using <driverhost>:<driverport>, e.g.
      <host_machine>:4040.
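One thing I was not sure about for step 3: my understanding is that JmxSink only registers the metrics with the JVM's local MBean server, and a remote JVisualVM connection normally needs the JMX remote port opened explicitly through JVM options. A sketch of what I mean (the port number 8090 is just a placeholder, and authentication/SSL are disabled here only for a trusted test network):

  spark-submit \
    --conf "spark.driver.extraJavaOptions=-Dcom.sun.management.jmxremote \
      -Dcom.sun.management.jmxremote.port=8090 \
      -Dcom.sun.management.jmxremote.rmi.port=8090 \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false" \
    <rest of the submit command>

I have not set any of these options yet.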



But it gives an error:

[screenshot: error message]



Any clue what is missing?


Regards
Ranju


From: Attila Zsolt Piros <piros.attila.zs...@gmail.com>
Sent: Monday, March 22, 2021 11:07 AM
To: Ranju Jain <ranju.j...@ericsson.com>
Cc: Mich Talebzadeh <mich.talebza...@gmail.com>; user@spark.apache.org
Subject: Re: Can JVisual VM monitoring tool be used to Monitor Spark Executor 
Memory and CPU

Hi Ranju!

I am quite sure that for your requirement, "monitor every component and isolate 
the resources consumed by each component individually", Spark metrics is the 
right direction to go.

> Why should only usedStorageMemory be checked?

Right, for you storage memory alone won't be enough; you need the system and the 
execution memory too.
I expect ".JVMHeapMemory" and ".JVMOffHeapMemory" are what you are looking for.
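If you just want to eyeball these values, the executors endpoint of the monitoring REST API exposes the peaks per executor. A rough sketch (I assume Spark 3.0+, where peakMemoryMetrics is reported, and the driver host here is a placeholder):

  # read peak JVM memory per executor from the Spark monitoring REST API
  import json
  import urllib.request

  base = "http://<driverhost>:4040/api/v1"  # placeholder: your driver UI address
  apps = json.load(urllib.request.urlopen(base + "/applications"))
  app_id = apps[0]["id"]  # take the first listed application
  executors = json.load(urllib.request.urlopen(f"{base}/applications/{app_id}/executors"))
  for e in executors:
      peaks = e.get("peakMemoryMetrics", {})  # absent until the first peak is reported
      print(e["id"], peaks.get("JVMHeapMemory"), peaks.get("JVMOffHeapMemory"))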

> Also, I noticed cpuTime provides the CPU time spent by an executor. But there 
> is no metric from which I can calculate the number of cores.

The number of cores is specified at spark-submit. IIRC, if you pass 3, it means 
that each executor can run a maximum of 3 tasks at the same time.
So all these cores will be used if there are enough tasks. I know this is not a 
perfect solution, but I hope it helps.
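For example (everything except the cores flag is a placeholder):

  spark-submit \
    --executor-cores 3 \
    --num-executors 2 \
    --executor-memory 4g \
    your_job.py

With --executor-cores 3 above, each executor runs at most 3 tasks concurrently.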

> Also, I see Grafana is a very good visualization tool in which all the metrics 
> can be viewed, but I have little idea of the steps to install it on a virtual 
> server and integrate it.

I cannot help with specifics here, but a monitoring system is a good idea, 
whether Grafana or Prometheus.

Best regards,
Attila

On Sun, Mar 21, 2021 at 3:01 PM Ranju Jain 
<ranju.j...@ericsson.com<mailto:ranju.j...@ericsson.com>> wrote:
Hi Mich/Attila,

@Mich Talebzadeh<mailto:mich.talebza...@gmail.com>: I considered the Spark GUI, 
but first I have a confusion at the memory level.

App configuration: spark.executor.memory=4g for the running Spark job.

In the Spark GUI I see that the running Spark job has a Peak Execution Memory of 
1 KB, as highlighted below.
I do not have a Storage Memory screenshot, so I calculated the total memory 
consumption at that point in time as follows.

Spark UI shows:

  spark.executor.memory = Peak Execution Memory + Storage Mem + Reserved Mem + User Memory
                        = 1 KB + Storage Mem + 300 MB + (4g * 0.25)
                        = 1 KB + Storage Mem + 300 MB + 1 GB
                        = approx. 1.5 GB
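A quick sanity check of the same arithmetic, with Storage Mem left as an unknown:

  # sanity check of the memory breakdown above; all values in GB
  executor_memory = 4.0            # spark.executor.memory = 4g
  reserved = 300.0 / 1024          # the fixed 300 MB reserved memory
  user = executor_memory * 0.25    # the (4g * 0.25) user memory term
  peak_execution = 1.0 / 1024**2   # the 1 KB peak execution memory, negligible
  known = peak_execution + reserved + user
  print(f"{known:.2f} GB + Storage Mem")  # prints: 1.29 GB + Storage Mem

So roughly 1.3 GB plus storage memory, which is how I get to approx. 1.5 GB.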


[screenshots: Spark UI showing Peak Execution Memory]

And if I look at the actual memory consumption of executors 0, 1, and 2 on the 
virtual server using the top command, it shows the readings below:

Executor 2: top
[screenshot: top output for executor 2]

Executor 0: top
[screenshot: top output for executor 0]

Please suggest: on the Spark GUI, can I go with the formula below to isolate how 
much memory the Spark component is consuming out of the several other components 
of a web application?

  spark.executor.memory = Peak Execution Memory + Storage Mem + Reserved Mem + User Memory
                        = 1 KB + Storage Mem + 300 MB + (4g * 0.25)


@Attila Zsolt Piros<mailto:piros.attila.zs...@gmail.com>: I checked the 
memoryMetrics.* of 
executor-metrics<https://spark.apache.org/docs/3.0.0-preview/monitoring.html#executor-metrics>, 
but here I have a confusion about these four:
usedOnHeapStorageMemory
usedOffHeapStorageMemory
totalOnHeapStorageMemory
totalOffHeapStorageMemory

Why should only usedStorageMemory be checked?

To isolate spark.executor.memory, should I check memoryMetrics.*, where only 
storage memory is given, or should I check peakMemoryMetrics.*, where all the 
peaks are specified:

  1.  Execution
  2.  Storage
  3.  JVM Heap

Also, I noticed cpuTime provides the CPU time spent by an executor. But there is 
no metric from which I can calculate the number of cores.
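My working idea, assuming I read the docs correctly that cpuTime is cumulative nanoseconds, is to take the core count from the submit configuration and estimate utilization from two samples (the sampled numbers below are made up):

  # rough executor CPU utilization from two cpuTime samples (made-up values)
  cores = 3                              # whatever was passed as --executor-cores
  cpu_ns_t0, cpu_ns_t1 = 8.0e9, 26.0e9   # cpuTime readings, in nanoseconds
  wall_seconds = 10.0                    # wall-clock time between the two readings
  busy_seconds = (cpu_ns_t1 - cpu_ns_t0) / 1e9
  utilization = busy_seconds / (wall_seconds * cores)
  print(f"approx {utilization:.0%} of {cores} cores busy")  # approx 60% of 3 cores busy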

As suggested, I checked Luca Canali's presentation; there I see JMXSink, which 
registers metrics for viewing in a JMX console. I think exposing these metrics 
via JMXSink would make it possible to visualize spark.executor.memory and the 
number of cores per executor in a Java monitoring tool.
Also, I see Grafana is a very good visualization tool in which all the metrics 
can be viewed, but I have little idea of the steps to install it on a virtual 
server and integrate it. I need to go through Grafana in detail.

Kindly suggest your views.

Regards
Ranju

From: Attila Zsolt Piros 
<piros.attila.zs...@gmail.com<mailto:piros.attila.zs...@gmail.com>>
Sent: Sunday, March 21, 2021 3:42 AM
To: Mich Talebzadeh 
<mich.talebza...@gmail.com<mailto:mich.talebza...@gmail.com>>
Cc: Ranju Jain 
<ranju.j...@ericsson.com.invalid<mailto:ranju.j...@ericsson.com.invalid>>; 
user@spark.apache.org<mailto:user@spark.apache.org>
Subject: Re: Can JVisual VM monitoring tool be used to Monitor Spark Executor 
Memory and CPU

Hi Ranju!

You can configure Spark's metric system.

Check the memoryMetrics.* of 
executor-metrics<https://spark.apache.org/docs/3.0.0-preview/monitoring.html#executor-metrics> 
and the CPU times in 
component-instance-executor<https://spark.apache.org/docs/3.0.0-preview/monitoring.html#component-instance--executor>.
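A quick way to see what the metric system emits, before wiring up a real sink, is the console sink. A minimal metrics.properties sketch (the 10-second period is arbitrary):

  *.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
  *.sink.console.period=10
  *.sink.console.unit=seconds
  executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource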

Regarding the details, I suggest checking Luca Canali's presentations about 
Spark's metric system and maybe his GitHub 
repo<https://github.com/LucaCanali/sparkMeasure>.

Best Regards,
Attila

On Sat, Mar 20, 2021 at 5:41 PM Mich Talebzadeh 
<mich.talebza...@gmail.com<mailto:mich.talebza...@gmail.com>> wrote:
Hi,

Have you considered spark GUI first?



view my LinkedIn 
profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.




On Sat, 20 Mar 2021 at 16:06, Ranju Jain 
<ranju.j...@ericsson.com.invalid<mailto:ranju.j...@ericsson.com.invalid>> wrote:
Hi All,

A virtual machine is running an application, and this application has various 
other 3PP components running, such as Spark, a database, etc.

My requirement is to monitor every component and isolate the resources consumed 
by each component individually.

I am thinking of using a common tool such as Java VisualVM, where I specify the 
JMX URL of every component and monitor each one.

For the other components, I am able to view their resources.

Is there a possibility of viewing the Spark executor CPU/memory via the Java 
VisualVM tool?
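My assumption is that each executor JVM would need JMX enabled through its own JVM options, along these lines (port 0 so that executors sharing a host do not clash; authentication and SSL disabled only because this is a trusted internal network), but I am not sure this is the right approach:

  spark.executor.extraJavaOptions=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=0 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false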

Please guide.

Regards
Ranju
