Hi Marco,

You first need to figure out why the checkpoint timed out (the UI shows the time consumed by each phase of a checkpoint). If the checkpoint really does need that long to complete, configure a longer timeout. If there are checkpoint errors, we first need to figure out what the problem is. In general, a checkpoint can be split into several parts: barrier alignment (there may be backpressure or something else that keeps some barriers from being received in time), sync duration (the thread is too busy ...), async duration (too much IO/network processing ...), etc.
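As a rough illustration of that triage, here is a minimal sketch in plain Java. It is not a Flink API: the class `CheckpointPhases`, its methods, and the sample durations are all hypothetical, and the numbers stand in for values you would copy by hand from the checkpoint details in the UI. It only shows the reasoning: find the phase that dominates, then compare the approximate total against the configured timeout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink) for comparing per-phase checkpoint durations. */
final class CheckpointPhases {

    /** Returns the name of the phase with the largest duration; the first one wins on ties. */
    static String slowestPhase(Map<String, Long> phaseMillis) {
        String slowest = null;
        long max = -1L;
        for (Map.Entry<String, Long> e : phaseMillis.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }

    /** Sums the phase durations. This is only an approximation of the end-to-end time. */
    static long totalMillis(Map<String, Long> phaseMillis) {
        long total = 0L;
        for (long v : phaseMillis.values()) {
            total += v;
        }
        return total;
    }

    /** True if the approximate total is longer than the configured checkpoint timeout. */
    static boolean exceedsTimeout(Map<String, Long> phaseMillis, long timeoutMillis) {
        return totalMillis(phaseMillis) > timeoutMillis;
    }

    public static void main(String[] args) {
        // Made-up sample values in milliseconds, as if read from the UI.
        Map<String, Long> phases = new LinkedHashMap<>();
        phases.put("barrier alignment", 540_000L);
        phases.put("sync duration", 2_000L);
        phases.put("async duration", 75_000L);

        long timeoutMillis = 10 * 60 * 1000L; // the 10 minute timeout from the question

        System.out.println("slowest phase: " + slowestPhase(phases));
        System.out.println("approx. total ms: " + totalMillis(phases));
        System.out.println("exceeds timeout: " + exceedsTimeout(phases, timeoutMillis));
    }
}
```

If one phase dominates, as barrier alignment does in this made-up sample, investigate that phase before raising the timeout. If every phase is legitimately long, the timeout itself can be raised in the job through `CheckpointConfig#setCheckpointTimeout`.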
Best,
Congxian

On Fri, Jan 29, 2021 at 7:19 AM, Marco Villalobos <mvillalo...@kineteque.com> wrote:

> I am kind of stuck in determining how large a checkpoint interval should be.
>
> Is there a guide for that? If a timeout time is 10 minutes, we time out, what is a good strategy for adjusting that?
>
> Where is a good starting point for a checkpoint? How shall they be adjusted?
>
> We often see checkpoint errors during our onTimer calls, I don't know if that's related.
>
> Marco A. Villalobos