Re: Checkpoint acknowledge takes too long

Hequn Cheng Thu, 25 Oct 2018 07:23:55 -0700

Hi Henry,

Thanks for letting us know.


On Thu, Oct 25, 2018 at 7:34 PM 徐涛 <happydexu...@gmail.com> wrote:

> Hi Hequn & Kien,
> The problem is finally solved.
> It was caused by slow sink writes. Because the job has only 2 tasks, I
> checked the backpressure and found that the source had high backpressure,
> so I improved the sink write performance. After that, the end-to-end
> duration dropped below 1s and the checkpoint timeouts went away.
>
> Best
> Henry
>
>
> On Oct 24, 2018, at 10:43 PM, 徐涛 <happydexu...@gmail.com> wrote:
>
> Hequn & Kien,
> Thanks a lot for your help, I will try it later.
>
> Best
> Henry
>
>
> On Oct 24, 2018, at 8:18 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>
> Hi Henry,
>
> @Kien is right. Take a thread dump to see what the TaskManager was
> doing. Also check whether GC happens frequently.
>
> Best, Hequn
>
>
> On Wed, Oct 24, 2018 at 5:03 PM 徐涛 <happydexu...@gmail.com> wrote:
>
>> Hi
>>         I am running a Flink application with parallelism 64. I left the
>> checkpoint timeout at its default value of 10 minutes. The state size is
>> less than 1MB, and I am using the FsStateBackend.
>>         The application triggers checkpoints, but all of them fail
>> with "Checkpoint expired before completing". In the checkpoint
>> history, 63 subtasks have acknowledged but one is still n/a, and
>> the alignment duration is quite long, about 5m27s.
>>         Why does one subtask not acknowledge? And since the alignment
>> duration is long, what influences the alignment duration?
>>         Thanks a lot.
>>
>> Best
>> Henry
>
>
>
>
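
For readers who find this thread later: the resolution above (a slow sink causing source backpressure and expired checkpoints) can be illustrated with a toy model. The sketch below is not Flink code and uses no Flink API. It is a deterministic, tick-based simulation, assuming that a checkpoint barrier travels in-band behind the records already buffered between source and sink. The class name, method name, buffer capacity, and tick costs are all hypothetical values chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only (not Flink): one source, one bounded buffer, one sink.
class BackpressureSketch {
    // Marker value standing in for a checkpoint barrier; records are >= 0.
    static final int BARRIER = -1;

    /**
     * Returns how many ticks pass between the source putting the barrier
     * into the buffer and the sink taking it out.
     *
     * bufferCapacity       max elements buffered between source and sink
     * sinkTicksPerRecord   ticks the sink spends writing one record
     * recordsBeforeBarrier records the source emits before the barrier
     */
    static int barrierLatencyTicks(int bufferCapacity,
                                   int sinkTicksPerRecord,
                                   int recordsBeforeBarrier) {
        if (bufferCapacity < 1 || sinkTicksPerRecord < 1 || recordsBeforeBarrier < 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        Deque<Integer> buffer = new ArrayDeque<>();
        int emitted = 0;            // records emitted by the source so far
        int barrierEmittedAt = -1;  // tick at which the barrier entered the buffer
        int sinkBusyUntil = 0;      // first tick at which the sink is free again

        for (int tick = 0; ; tick++) {
            // Sink: when idle, take the next element from the buffer.
            if (tick >= sinkBusyUntil && !buffer.isEmpty()) {
                int element = buffer.pollFirst();
                if (element == BARRIER) {
                    return tick - barrierEmittedAt;
                }
                sinkBusyUntil = tick + sinkTicksPerRecord;
            }
            // Source: emit one element per tick, but only if the buffer has
            // room. A full buffer is what shows up as backpressure.
            if (barrierEmittedAt < 0 && buffer.size() < bufferCapacity) {
                if (emitted < recordsBeforeBarrier) {
                    buffer.addLast(emitted++);
                } else {
                    buffer.addLast(BARRIER);
                    barrierEmittedAt = tick;
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("fast sink: " + barrierLatencyTicks(4, 1, 100) + " ticks");
        System.out.println("slow sink: " + barrierLatencyTicks(4, 10, 100) + " ticks");
    }
}
```

In this model, the barrier's delay grows with the number of buffered records multiplied by the sink's per-record cost, so speeding up the sink shortens the end-to-end checkpoint duration. Real jobs have many buffers and parallel channels, so the numbers here are only qualitative.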
