Hi Guido,

Sorry for the late reply. You were collecting the stats every second.
As far as I know, Flink collects the stats internally every 5 seconds,
so you can change either your polling interval or Flink's (I think the
setting is taskmanager.heartbeat-interval).

Regarding the details on PS-Scavenge, MarkSweep etc.: We just use the names
the Java management beans return, so you can just google for the names and
read how to interpret them. For example:
http://www.ibm.com/developerworks/library/j-jtp11253/
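
As a rough sketch of where such names and numbers come from: the standard
java.lang.management API exposes one GarbageCollectorMXBean per collector,
each with a name, a collection count and a collection time. The class name
GcBeanDump is made up for illustration, and this is not Flink's actual code.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Flink code): lists what the JVM's GC beans report.
class GcBeanDump {

    // One line per collector: bean name, total number of collections,
    // and approximate accumulated collection time in milliseconds.
    // Both counters may be -1 if the JVM does not define them.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

On a JVM running the parallel collector, the bean names are "PS Scavenge"
(young generation) and "PS MarkSweep" (old generation). Both getters return
totals accumulated since JVM start, so the difference between two samples
gives the GC activity in that interval.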

The load is the operating system load.
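
For reference, the JVM exposes such a value through OperatingSystemMXBean.
The sketch below assumes the reported "load" corresponds to
getSystemLoadAverage(). I have not checked that against the Flink source,
and the class name LoadProbe is made up.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not Flink code): reads the OS load average via JMX.
class LoadProbe {

    // System load average for the last minute; negative if the platform
    // does not provide it.
    static double systemLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage();
    }

    public static void main(String[] args) {
        System.out.println("load=" + systemLoad());
    }
}
```

This number describes the whole machine, not just the TaskManager JVM, so
it is not derived from the GC counters.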



On Thu, Feb 4, 2016 at 10:25 PM, Guido <gmazza...@gmail.com> wrote:

> Hello,
>
> I have a few questions regarding garbage collector stats on TaskManagers,
> and any help or further documentation would be great.
> I collected stats by polling once per second on 7 TaskManagers, through
> the corresponding request (/taskmanagers/<idtaskmanager>/) of the
> Monitoring REST API, while a job that took 38 seconds overall was
> running.
>
> This way I got 38 records for each TaskManager. Focusing on the garbage
> collector’s stats, I can see, for example, in 1 of the 38 records:
>
> - PS-Scavenge.Time: 2597, PS-MarkSweep.Time: 29016;
> 1. Is it correct to assume they represent the total elapsed time of the
> different GCs (young and old gen, respectively)? So I basically got a
> running sum?
> 2. If yes, the values are in milliseconds, so 29 sec?
>
> 3. Could they be used to get how much time has been wasted in total
> because of the “Stop-the-world” GCs policy?
>
> Finally, on the same record:
>
> - PS-Scavenge.Count: 3, PS-MarkSweep.Time: 5, load: 3.73.
>
> 4. Is the “load” value closely related to these?
>
> Sorry if it has been quite long and thanks a lot.
>
> Guido
>
>
>
