Please let me know how to check the RPC response time and the CPU cost.

On Mon, Sep 27, 2021 at 1:19 PM JING ZHANG <beyond1...@gmail.com> wrote:
> Hi,
>
> Since there is not enough information, you could first check the back
> pressure status of the job [1] and find the task that caused the back
> pressure.
> Then try to find out why the task processed data slowly. There are many
> possible reasons, for example:
> (1) Does data skew exist, meaning some tasks processed more input data
> than the other tasks?
> (2) Is the CPU cost very high?
> (3) Does the RPC response start to slow down?
> (4) If you choose async mode lookup, the LookupJoin operator needs to
> buffer some data into state. Which state backend do you use? Does the state
> backend work fine?
> ...
>
> Would you please provide more information about the job, for example back
> pressure status, input data distribution, and async mode or sync mode lookup?
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/monitoring/back_pressure/
>
> Best,
> JING ZHANG
>
> Ragini Manjaiah <ragini.manja...@gmail.com> wrote on Mon, Sep 27, 2021 at 2:05 PM:
>
>> Hi,
>> I have a Flink real-time job which processes user records from a topic and
>> also reads data from HBase, which acts as a lookup table. If the lookup
>> table does not contain the required metadata, then it queries the external
>> DB via an API. For the first 1 to 2 hours it works fine without issues;
>> later the throughput drops drastically to 30 TPS. What do I need to look
>> into in such a situation? No exceptions are caught. How can I check where
>> the bottleneck is? Can someone shed some light on this?
>>
>>
>> Thanks & Regards
>> Ragini Manjaiah
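
For checks (1) and (3) in the quoted list, the idea can be sketched roughly as follows. This is only an illustration in plain Python, not Flink API code: `LatencyTracker` and `skew_ratio` are hypothetical names, and it assumes you can wrap the HBase or external API call yourself and that you can read per-subtask input record counts from somewhere such as the job's web UI or metrics. In an actual Flink job the same timing would more likely be reported through the job's own metrics.

```python
import time
from statistics import quantiles


class LatencyTracker:
    """Collects wall-clock durations (in ms) of calls to an external lookup."""

    def __init__(self):
        self.samples_ms = []

    def timed(self, fn, *args, **kwargs):
        # Run the lookup and record how long it took, even if it raises.
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.samples_ms.append((time.perf_counter() - start) * 1000.0)

    def p99_ms(self):
        # statistics.quantiles needs at least two samples.
        if len(self.samples_ms) < 2:
            return self.samples_ms[0] if self.samples_ms else 0.0
        return quantiles(self.samples_ms, n=100)[98]


def skew_ratio(records_per_subtask):
    """Max divided by mean of per-subtask input counts.

    A value close to 1 means the input is evenly spread; a value far
    above 1 suggests that one subtask receives much more data.
    """
    mean = sum(records_per_subtask) / len(records_per_subtask)
    return max(records_per_subtask) / mean if mean else 0.0
```

If the 99th percentile latency grows steadily over the first hours of the run, the external lookup is a likely suspect; if the skew ratio is well above 1, a few hot keys may be overloading a single subtask.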