Re: flink job : TPS drops from 400 to 30 TPS

Ragini Manjaiah Mon, 27 Sep 2021 02:26:30 -0700

Please let me know how to check the RPC response time and the CPU cost.

On Mon, Sep 27, 2021 at 1:19 PM JING ZHANG <beyond1...@gmail.com> wrote:


> Hi,
> Since there is not enough information yet, you could first check the back
> pressure status of the job [1] and find the task that causes the back
> pressure.
> Then try to find out why that task processes data slowly. There are many
> possible reasons, for example:
> (1) Is there data skew, i.e. do some tasks process more input data than
> the other tasks?
> (2) Is the CPU cost very high?
> (3) Have RPC responses started to slow down?
> (4) If you choose async mode lookup, the LookupJoin operator needs to
> buffer some data into state. Which state backend do you use? Does the state
> backend work fine?
> ...
>
> Would you please provide more information about the job, for example the
> back pressure status, the input data distribution, and whether the lookup
> runs in async or sync mode?
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/monitoring/back_pressure/
>
> Best,
> JING ZHANG
>
> Ragini Manjaiah <ragini.manja...@gmail.com> wrote on Mon, Sep 27, 2021 at 2:05 PM:
>
>> Hi,
>> I have a Flink real-time job which processes user records from a topic and
>> also reads data from HBase, which acts as a lookup table. If the lookup
>> table does not contain the required metadata, the job queries the external
>> DB via an API. For the first 1 to 2 hours it works fine without issues;
>> later the throughput drops drastically to 30 TPS. What do I need to look
>> into in such a situation? No exceptions are caught. How can I check where
>> the bottleneck is? Can someone throw some light on this?
>>
>>
>> Thanks & Regards
>> Ragini Manjaiah
>>
>>
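
Editor's note on point (1) above: one rough way to check for data skew is to compare the per-subtask input record counts of the suspect task (for example, as copied by hand from the subtask view of the Flink web UI). The sketch below is only an illustration of that arithmetic. The function names, the 2.0 threshold, and the sample counts are assumptions made up for this example; they are not part of Flink and do not come from the job discussed in this thread.

```python
from statistics import mean


def skew_ratio(records_per_subtask):
    """Return max/mean of the per-subtask input counts.

    A value close to 1.0 means the input is spread evenly; a much larger
    value means at least one subtask receives far more data than average.
    """
    if not records_per_subtask:
        raise ValueError("need at least one subtask count")
    avg = mean(records_per_subtask)
    if avg == 0:
        return 1.0  # no input at all, so nothing is skewed
    return max(records_per_subtask) / avg


def hot_subtasks(records_per_subtask, threshold=2.0):
    """Return the indices of subtasks whose count exceeds threshold * mean.

    The threshold of 2.0 is an arbitrary starting point, not a Flink default.
    """
    if not records_per_subtask:
        return []
    avg = mean(records_per_subtask)
    return [
        index
        for index, count in enumerate(records_per_subtask)
        if avg > 0 and count > threshold * avg
    ]


if __name__ == "__main__":
    # Hypothetical counts for a task with parallelism 4.
    counts = [1000, 1100, 950, 9000]
    print("skew ratio:", round(skew_ratio(counts), 2))
    print("hot subtasks:", hot_subtasks(counts))
```

If one or two subtasks stand out like this and they are also the ones reported as busy or back-pressured, skewed keys are a likely cause; if the counts are even, the CPU, RPC and state backend questions are the better place to look next.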
