Thank you, Chesnay!

On Thu, Dec 12, 2019 at 5:46 AM Chesnay Schepler <ches...@apache.org> wrote:
> Yes, after a cluster is started it takes a few seconds for (any) metrics
> to become available.
>
> On 12/12/2019 11:36, Pankaj Chand wrote:
>
> Hi Vino,
>
> Thank you for the links regarding backpressure!
>
> I am currently using code to get metrics by calling the REST API via curl.
> However, the REST API often returns an empty JSON object/array. Piped
> through jq (for filtering JSON), this produces a null value, which breaks
> my code.
>
> For example, in a YARN cluster in session mode, the metric
> "metrics?get=Status.JVM.CPU.Load" randomly (I think) returns either an
> empty JSON object/array or an actual value.
>
> Is it possible that, for CPU Load, the empty JSON object is returned when
> the job was started less than 10 seconds ago?
>
> Thanks,
>
> Pankaj
>
> On Mon, Dec 9, 2019 at 4:21 AM vino yang <yanghua1...@gmail.com> wrote:
>
>> Hi Pankaj,
>>
>> > Is there any sample code for how to read such default metrics? Is
>> > there any way to query the default metrics, such as CPU usage and
>> > Memory, without using REST API or Reporters?
>>
>> What's your real requirement? Can you use code to call the REST API? Why
>> does it not match your requirements?
>>
>> > Additionally, how do I query Backpressure using code, or is it still
>> > only visually available via the dashboard UI? Consequently, is there
>> > any way to infer Backpressure by querying one (or more) of the Memory
>> > metrics of the TaskManager?
>>
>> Backpressure is related not only to memory metrics but also to IO and
>> network metrics. For more details about measuring backpressure, please
>> see these blog posts [1][2].
>>
>> [1]: https://flink.apache.org/2019/06/05/flink-network-stack.html
>> [2]: https://flink.apache.org/2019/07/23/flink-network-stack-2.html
>>
>> Best,
>> Vino
>>
>> Pankaj Chand <pankajchanda...@gmail.com> wrote on Mon, Dec 9, 2019 at 12:07 PM:
>>
>>> Hello,
>>>
>>> Using Flink on Yarn, I could not understand the documentation for how to
>>> read the default metrics via code. In particular, I want to read
>>> throughput, i.e. CPU usage, Task/Operator's numRecordsOutPerSecond, and
>>> Memory.
>>>
>>> Is there any sample code for how to read such default metrics? Is there
>>> any way to query the default metrics, such as CPU usage and Memory, without
>>> using REST API or Reporters?
>>>
>>> Additionally, how do I query Backpressure using code, or is it still
>>> only visually available via the dashboard UI? Consequently, is there any
>>> way to infer Backpressure by querying one (or more) of the Memory metrics
>>> of the TaskManager?
>>>
>>> Thank you,
>>>
>>> Pankaj
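
For anyone else hitting the empty-response case discussed in this thread, here is a minimal sketch of how a client might tolerate it: treat an empty JSON array/object as "metric not reported yet" and retry for a while instead of failing. It assumes the endpoint returns a JSON array of {"id": ..., "value": ...} entries once the metric exists. The function names (parse_metric, http_get, fetch_metric), the retry count, and the delay are illustrative choices of mine, not anything from Flink itself.

```python
import json
import time
import urllib.request


def parse_metric(body, metric_id):
    """Return the metric value as a float, or None if it is not reported yet.

    Assumption: a populated response looks like
    [{"id": "<metric_id>", "value": "<number as string>"}].
    """
    entries = json.loads(body)
    # An empty array (or an empty object) means the metric is not available yet.
    if not isinstance(entries, list):
        return None
    for entry in entries:
        if isinstance(entry, dict) and entry.get("id") == metric_id and "value" in entry:
            try:
                return float(entry["value"])
            except (TypeError, ValueError):
                return None
    return None


def http_get(url, timeout=5.0):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def fetch_metric(base_url, metric_id, retries=10, delay=2.0, get=http_get, sleep=time.sleep):
    """Poll <base_url>/metrics?get=<metric_id> until a value shows up.

    Returns the value as a float, or None if the metric never appeared
    within `retries` attempts. `get` and `sleep` are injectable for testing.
    """
    url = f"{base_url}/metrics?get={metric_id}"
    for attempt in range(retries):
        value = parse_metric(get(url), metric_id)
        if value is not None:
            return value
        if attempt < retries - 1:
            sleep(delay)  # metrics can take a few seconds to appear after startup
    return None
```

A call might look like fetch_metric("http://<rest-host>:<rest-port>/jobmanager", "Status.JVM.CPU.Load"), where the host, port, and path prefix are placeholders that depend on the deployment (on YARN the REST endpoint is typically reached through the application's tracking URL). Returning None rather than raising lets the caller decide whether a missing metric is an error or just "too early".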