Hi,

What source are you using in your job, and which filesystem (if any) is it writing to?

Best,
Aljoscha

> On 5. Sep 2017, at 03:06, Mu Kong <[email protected]> wrote:
>
> Hi all,
>
> I have some questions about an experience I had with savepoints.
> Last night my Flink cluster's memory usage seemed weird, so I decided to
>
> 1. create a savepoint for the running job (there was only one job running at the time),
> 2. cancel the job from the web UI, and
> 3. restart the cluster.
>
> When I tried to resume the job from that savepoint, I got this exception:
> "Truncate did not truncate to right length. Should be 11757 is 56383."
>
> A savepoint is also created every day at 4 a.m., so after the savepoint I
> had created before canceling the job failed, I tried the 4 a.m. savepoint
> instead, and it seemed to work well.
>
> Then this morning I noticed data loss for the period between canceling the
> job and resuming it.
>
> I thought that if I ran the job from the savepoint created at 4 a.m., it
> would start processing data from 4 a.m. Or am I missing something here?
>
> Also, I didn't add a uid to the addSource() call. Maybe the auto-generated
> ID changed when I restarted the cluster, and that might be why the recovery
> didn't go well?
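
On the uid question: as far as I know, auto-generated operator IDs are derived from the structure of the job graph, so a cluster restart alone should not change them as long as the program itself is unchanged. Setting explicit uids is still advisable, because it keeps savepoint state mappable to operators when the job is modified later. A minimal sketch, assuming the Java DataStream API; `CounterSource`, the uid strings, and the job name are hypothetical placeholders, not taken from your job:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class UidExample {

    // Hypothetical stand-in for the real source used in the job.
    // This toy source keeps no checkpointed state; explicit uids matter
    // most for operators that do (for example, sources tracking offsets).
    public static class CounterSource implements SourceFunction<Long> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            long next = 0;
            while (running) {
                // Emit under the checkpoint lock, as recommended for sources.
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(next++);
                }
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.addSource(new CounterSource())
                .uid("counter-source")   // explicit ID used to match state on restore
                .name("Counter source")  // display name only, not used for state matching
                .map(new MapFunction<Long, Long>() {
                    @Override
                    public Long map(Long value) {
                        return value * 2;
                    }
                })
                .uid("double-map")
                .print();

        env.execute("uid example");
    }
}
```

Each uid has to be unique within the job and should stay the same across versions of the program, otherwise the state in a savepoint cannot be matched to the operator it belongs to.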
