Additionally, after an old job completes and I run a new job on the Flink Yarn session-mode cluster, querying for metrics before they become available for the new job sometimes returns the last metrics of the old job instead. This happens even if I wait for the TaskManager to be released by Flink (as shown in Flink's dashboard Web UI).
This shouldn't happen, since the TaskManager ID should be different, even though the new TaskManager would take the old index in the TaskManagers list.

Would this be a bug?

Thanks!

Pankaj

On Thu, Dec 12, 2019 at 5:59 AM Pankaj Chand <pankajchanda...@gmail.com> wrote:

> Thank you, Chesnay!
>
> On Thu, Dec 12, 2019 at 5:46 AM Chesnay Schepler <ches...@apache.org>
> wrote:
>
>> Yes, when a cluster was started it takes a few seconds for (any) metrics
>> to be available.
>>
>> On 12/12/2019 11:36, Pankaj Chand wrote:
>>
>> Hi Vino,
>>
>> Thank you for the links regarding backpressure!
>>
>> I am currently using code to get metrics by calling REST API via curl.
>> However, many times the REST API via curl gives an empty JSON object/array.
>> Piped through JQ (for filtering JSON) it produces a null value. This is
>> breaking my code.
>> Example in a Yarn cluster session mode, the following metric
>> "metrics?get=Status.JVM.CPU.Load" randomly (I think) returns an empty json
>> object/array or an actual value.
>>
>> Is it possible that for CPU Load, the empty JSON object is returned when
>> the job is newly started less than 10 seconds ago.
>>
>> Thanks,
>>
>> Pankaj
>>
>>
>>
>> On Mon, Dec 9, 2019 at 4:21 AM vino yang <yanghua1...@gmail.com> wrote:
>>
>>> Hi Pankaj,
>>>
>>> > Is there any sample code for how to read such default metrics? Is
>>> there any way to query the default metrics, such as CPU usage and Memory,
>>> without using REST API or Reporters?
>>>
>>> What's your real requirement? Can you use code to call REST API? Why
>>> does it not match your requirements?
>>>
>>> > Additionally, how do I query Backpressure using code, or is it still
>>> only visually available via the dashboard UI? Consequently, is there any
>>> way to infer Backpressure by querying one (or more) of the Memory metrics
>>> of the TaskManager?
>>>
>>> The backpressure is related to not only memory metrics but also IO and
>>> network metrics, for more details about measure backpressure please see
>>> this blog.[1][2]
>>>
>>> [1]: https://flink.apache.org/2019/06/05/flink-network-stack.html
>>> [2]: https://flink.apache.org/2019/07/23/flink-network-stack-2.html
>>>
>>> Best,
>>> Vino
>>>
>>> On Mon, Dec 9, 2019 at 12:07 PM, Pankaj Chand <pankajchanda...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Using Flink on Yarn, I could not understand the documentation for how
>>>> to read the default metrics via code. In particular, I want to read
>>>> throughput, i.e. CPU usage, Task/Operator's numRecordsOutPerSecond, and
>>>> Memory.
>>>>
>>>> Is there any sample code for how to read such default metrics? Is
>>>> there any way to query the default metrics, such as CPU usage and Memory,
>>>> without using REST API or Reporters?
>>>>
>>>> Additionally, how do I query Backpressure using code, or is it still
>>>> only visually available via the dashboard UI? Consequently, is there any
>>>> way to infer Backpressure by querying one (or more) of the Memory metrics
>>>> of the TaskManager?
>>>>
>>>> Thank you,
>>>>
>>>> Pankaj
>>>>
>>>
>>
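For anyone hitting the same empty-response problem discussed in this thread, here is a rough sketch of one way to tolerate it on the client side instead of letting a null value break the calling code. It assumes the metrics endpoint answers with a JSON array of {"id", "value"} objects and with an empty array (or object) while the metric is not yet available. The function names (parse_metric, http_get, poll_metric), the retry count, and the delay are made up for illustration and are not part of Flink. It does not address the stale values from a previous job.

```python
import json
import time
import urllib.request


def parse_metric(body, metric_id):
    """Return the metric value as a float, or None if the response has no usable entry."""
    try:
        entries = json.loads(body)
    except json.JSONDecodeError:
        return None
    # An empty object ({}) or anything that is not a list is treated as "not available yet".
    if not isinstance(entries, list):
        return None
    for entry in entries:
        if isinstance(entry, dict) and entry.get("id") == metric_id and "value" in entry:
            try:
                return float(entry["value"])
            except (TypeError, ValueError):
                return None
    return None


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def poll_metric(url, metric_id, fetch=http_get, attempts=10, delay_s=2.0, sleep=time.sleep):
    """Query the endpoint until the metric shows up; return None if it never does.

    attempts and delay_s are arbitrary illustrative defaults, not Flink settings.
    """
    for attempt in range(attempts):
        value = parse_metric(fetch(url), metric_id)
        if value is not None:
            return value
        if attempt < attempts - 1:
            sleep(delay_s)
    return None
```

A call might look like poll_metric(base + "/taskmanagers/" + tm_id + "/metrics?get=Status.JVM.CPU.Load", "Status.JVM.CPU.Load"), where base and tm_id are placeholders for the REST address of the session cluster and a TaskManager ID taken from the TaskManagers list. The fetch and sleep parameters are injectable only so the retry logic can be exercised without a running cluster.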