Re: Slow flink checkpoint

2018-04-16 Thread Fabian Hueske
Hi everybody, thanks so much for looking into this issue and posting the detailed description of your approach. As said before, improving the checkpointing performance for timers is very important for Flink. I'm not familiar with the internals of the timer service checkpointing, but

Re: Slow flink checkpoint

2018-04-16 Thread makeyang
Since Flink Forward SF is over, could you spare a few minutes to look at this issue and share your thoughts on it? Could you review and comment on my design, or give us a design so that I can help implement it? Thanks a lot. -- Sent from: http://apache-flink-user-mailing-list-archive.2336

Re: Slow flink checkpoint

2018-04-15 Thread 林德强
Hi Stefan, Fabian, Keyang is an engineer on our team; he has put a lot of effort into making the timers' snapshot asynchronous. What do you think of his idea? Best, Deqiang TIG.JD.COM > On April 1, 2018, at 7:21 PM, makeyang wrote: > > I have put a lot of effort into this issue and try

Re: Slow flink checkpoint

2018-04-04 Thread makeyang
The test is very promising. The time the sync part takes drops from a couple of seconds to a couple of milliseconds, a 1000x reduction (the overall time is not reduced, since the work just moves from sync to async). Are you guys interested in this change? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.na

Re: Slow flink checkpoint

2018-04-01 Thread makeyang
I have put a lot of effort into this issue and tried to resolve it: 1. Let me describe the current timers' snapshot path first: a) for each key group, invoke InternalTimeServiceManager.snapshotStateForKeyGroup b) InternalTimeServiceManager creates an InternalTimerServiceSerializationProxy to write sn

Re: Slow flink checkpoint

2018-03-19 Thread Fabian Hueske
Hi, yes, you cannot start a separate thread to clean up the state. State is managed by Flink and can only be accessed at certain points in time when the user code is called. If you are using event time, another trick you could play is to only register all timers on (currentWatermark + 1). That wil
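
A minimal sketch of why registering on (currentWatermark + 1) keeps the timer count low, assuming that timers are deduplicated per key and timestamp: every element a key sees between two watermarks registers the same timestamp, so at most one timer per key is pending. The class and method names below are hypothetical stand-ins, not Flink's API; in a real job the equivalent calls would go through the timer service available in the user function.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class WatermarkTimerSketch {

    // Hypothetical stand-in for per-key timer storage; a set models
    // deduplication of timers with the same key and timestamp.
    private final Map<String, Set<Long>> timersByKey = new HashMap<>();
    private long currentWatermark = 0L;

    // Called for every element: always register on (currentWatermark + 1).
    void onElement(String key) {
        timersByKey.computeIfAbsent(key, k -> new TreeSet<>())
                   .add(currentWatermark + 1);
    }

    // Advancing the watermark fires (here: simply removes) all due timers.
    void advanceWatermark(long newWatermark) {
        currentWatermark = newWatermark;
        for (Set<Long> timers : timersByKey.values()) {
            timers.removeIf(t -> t <= newWatermark);
        }
    }

    int pendingTimers(String key) {
        return timersByKey.getOrDefault(key, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        WatermarkTimerSketch sketch = new WatermarkTimerSketch();
        sketch.advanceWatermark(100L);
        sketch.onElement("a");
        sketch.onElement("a");
        sketch.onElement("a");
        // Three elements, but all registered the same timestamp (101).
        System.out.println(sketch.pendingTimers("a"));
    }
}
```

The trade-off in this sketch is that the timer fires on the next watermark advance rather than at a specific cleanup time, so the callback itself has to decide whether the state is actually old enough to drop.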

Re: Slow flink checkpoint

2018-03-16 Thread 林德强
Hi Fabian, reducing the number of timers is a good idea. But in my application the timer is different from the key registered following the keyBy. Maybe it can't work with an upper and lower bound. I tried modifying the Flink source and starting a thread to clean up the expired keyed state, but it d

Re: Slow flink checkpoint

2018-03-16 Thread Stefan Richter
Hi, yes, that is correct: the timer service is currently only available in main memory and only with synchronous snapshots. This topic is on our TODO list for after the Flink 1.5 release. Best, Stefan > On 16.03.2018 at 09:03, Fabian Hueske wrote: > > Hi, > > AFAIK, that's not possible. >

Re: Slow flink checkpoint

2018-03-16 Thread Fabian Hueske
Hi, AFAIK, that's not possible. The only "solution" is to reduce the number of timers. Whether that's possible or not depends on the application. For example, if you use timers to clean up state, you can work with an upper and lower bound and only register one timer for each (upper - lower) inter
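
One way to read the upper/lower bound suggestion, sketched under the assumption that cleanup may happen somewhat later than requested: round each requested cleanup time up to the next interval boundary, so that all requests falling into the same interval share one timer timestamp. The class name, method names, and the 1000 ms interval are illustrative assumptions, not part of Flink.

```java
import java.util.HashSet;
import java.util.Set;

public class TimerCoalescing {

    // Round a requested cleanup time up to the next multiple of intervalMs,
    // so the timer never fires earlier than requested.
    // Assumes non-negative timestamps and intervalMs > 0.
    static long coalesce(long cleanupTimeMs, long intervalMs) {
        return Math.floorDiv(cleanupTimeMs + intervalMs - 1, intervalMs) * intervalMs;
    }

    // Count how many distinct timer timestamps remain after coalescing.
    static int distinctTimers(long[] cleanupTimesMs, long intervalMs) {
        Set<Long> timers = new HashSet<>();
        for (long t : cleanupTimesMs) {
            timers.add(coalesce(t, intervalMs));
        }
        return timers.size();
    }

    public static void main(String[] args) {
        long[] requested = {1_001L, 1_250L, 1_999L, 2_000L, 2_001L};
        // Five requested cleanup times collapse onto the boundaries 2000 and 3000.
        System.out.println(distinctTimers(requested, 1_000L));
    }
}
```

A larger interval means fewer timers to snapshot, at the cost of state living up to one interval longer than strictly necessary.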