Hi Henry,

Thanks for letting us know.

On Thu, Oct 25, 2018 at 7:34 PM 徐涛 <happydexu...@gmail.com> wrote:

> Hi Hequn & Kien,
>
> The problem is finally solved.
> It was caused by slow sink writes. Because the job has only 2 tasks, I
> checked the backpressure and found that the source had high backpressure,
> so I tried to improve the sink write. After that, the end-to-end duration
> is below 1s and the checkpoint timeout is fixed.
>
> Best
> Henry
>
>
> On Oct 24, 2018, at 10:43 PM, 徐涛 <happydexu...@gmail.com> wrote:
>
> Hequn & Kien,
> Thanks a lot for your help, I will try it later.
>
> Best
> Henry
>
>
> On Oct 24, 2018, at 8:18 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>
> Hi Henry,
>
> @Kien is right. Take a thread dump to see what the TaskManager was
> doing. Also check whether GC happens frequently.
>
> Best, Hequn
>
>
> On Wed, Oct 24, 2018 at 5:03 PM 徐涛 <happydexu...@gmail.com> wrote:
>
>> Hi
>> I am running a Flink application with parallelism 64. I left the
>> checkpoint timeout at its default value of 10 minutes. The state size is
>> less than 1MB, and I am using the FsStateBackend.
>> The application triggers some checkpoints, but all of them fail
>> due to "Checkpoint expired before completing". I checked the checkpoint
>> history and found that 63 subtasks acknowledged, but one was left n/a, and
>> the alignment duration is also quite long, about 5m27s.
>> I want to know why one subtask does not acknowledge. And since the
>> alignment duration is long, what influences the alignment duration?
>> Thanks a lot.
>>
>> Best
>> Henry