Hi Siew Wai Yow,

Yes, David is correct: the crashed TM must be recovered, and the number of TMs must be the same before and after the crash. What I meant in my last reply is that the state may not end up on the same TM after the crash. Sorry for the unclear description.
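
To make this concrete, here is a minimal Python sketch of the behaviour using the five records from your example. It is a toy model, not Flink's API: the class ToyJob, the KEY_TO_SUBTASK table (a stand-in for Flink's hash-based key partitioning) and the TM name "TM2-new" are all made up for illustration, and I assume the checkpoint was taken after rec3.

```python
from copy import deepcopy

# Hypothetical stand-in for Flink's hash-based key partitioning:
# key -> index of the parallel subtask that owns the state for that key.
KEY_TO_SUBTASK = {"KEYA": 0, "KEYB": 1, "KEYC": 2}


class ToyJob:
    """Toy model of a keyed counting job with checkpoints (not Flink's real API)."""

    def __init__(self, records, deployment):
        self.records = records                  # replayable source, addressed by offset
        self.deployment = list(deployment)      # subtask index -> TM name
        self.offset = 0                         # next record to read
        self.state = [{} for _ in deployment]   # per-subtask local state (think: local RocksDB)
        self.durable = None                     # last successful checkpoint (think: HDFS/S3)

    def process(self, n):
        """Read the next n records and update the per-key counts."""
        for key in self.records[self.offset:self.offset + n]:
            local = self.state[KEY_TO_SUBTASK[key]]
            local[key] = local.get(key, 0) + 1
            self.offset += 1

    def checkpoint(self):
        """Copy the source offset and all local state to durable storage."""
        self.durable = (self.offset, deepcopy(self.state))

    def recover(self, new_deployment):
        """After a TM crash: restart on the same number of TMs from the last checkpoint."""
        if len(new_deployment) != len(self.deployment):
            raise ValueError("need as many TMs as before the crash")
        self.offset, snapshot = self.durable
        self.state = deepcopy(snapshot)         # updates made after the checkpoint are gone
        self.deployment = list(new_deployment)  # subtasks may now run on different TMs

    def location(self, key):
        """Name of the TM currently holding the state for this key."""
        return self.deployment[KEY_TO_SUBTASK[key]]

    def count(self, key):
        return self.state[KEY_TO_SUBTASK[key]].get(key, 0)


records = ["KEYA", "KEYB", "KEYA", "KEYC", "KEYB"]  # rec1..rec5 from the question
job = ToyJob(records, ["TM1", "TM2", "TM3"])
job.process(3)
job.checkpoint()   # snapshot holds KEYA=2, KEYB=1 at offset 3
job.process(2)     # KEYC=1, KEYB=2, not yet checkpointed

# TM2 crashes; a replacement TM joins and the subtasks are redeployed.
job.recover(["TM3", "TM1", "TM2-new"])
after_restore = {k: job.count(k) for k in KEY_TO_SUBTASK}

# The job rewinds to the checkpointed offset and replays rec4 and rec5.
job.process(len(records) - job.offset)
final = {k: job.count(k) for k in KEY_TO_SUBTASK}
```

Right after recovery the counts are KEYA=2, KEYB=1, KEYC=0, so the updates from rec4 and rec5 are lost. Once the two records are replayed the counts are KEYA=2, KEYB=2, KEYC=1 again, and in this run the KEYB state now lives on TM1 rather than TM2.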

Siew Wai Yow <wai_...@hotmail.com> wrote on Sat, Jan 12, 2019 at 6:44 PM:

> Thanks Qiu, but David has a different view on Stack Overflow. He mentioned
> that the crashed TM must be recovered.
>
> https://stackoverflow.com/questions/54149134/what-happen-to-state-in-flink-task-manager-when-crash/54153686?noredirect=1#comment95144500_54153686
>
> "The crashed TM must be recovered, and state updates since the last
> checkpoint will be lost. Of course that lost state should be recreated as
> the job rewinds and resumes."
>
> Hi, Siew Wai Yow
>
> When the job is running, the state is stored in the local RocksDB. Flink
> copies all the needed state to checkpointPath when taking a checkpoint.
> If there is any failure, the job is restored from the last *successful*
> checkpoint, and the restored state is assigned to all the current TMs
> (these TMs do not need to be the same as before).
>
> Siew Wai Yow <wai_...@hotmail.com> wrote on Sat, Jan 12, 2019 at 11:24 AM:
>
>> Thanks, but this is something I already know. I would like to know whether
>> another TM takes over the crashed TM's state to keep the data complete (say
>> the state is partitioned by key, so state for different keys is stored on
>> different TMs), or whether the crashed TM needs to be recovered before the
>> job can continue.
>>
>> For example, 5 records:
>> rec1:KEYA
>> rec2:KEYB
>> rec3:KEYA
>> rec4:KEYC
>> rec5:KEYB
>>
>> TM1 stored state for rec1:KEYA, rec3:KEYA
>> TM2 stored state for rec2:KEYB, rec5:KEYB  <-- crashed; what happens to
>> the state stored in TM2?
>> TM3 stored state for rec4:KEYC
>>
>> If TM2 crashes, will rec2 and rec5 be assigned to another TM? Or are those
>> records only recovered when TM2 is recovered?
>>
>> Thanks.
>>
>> ------------------------------
>> *From:* Jamie Grier <jgr...@lyft.com>
>> *Sent:* Saturday, January 12, 2019 2:26 AM
>> *To:* Siew Wai Yow
>> *Cc:* user@flink.apache.org
>> *Subject:* Re: What happen to state in Flink Task Manager when crash?
>>
>> Flink is designed such that local state is backed up to a highly
>> available system such as HDFS or S3. When a TaskManager fails, state is
>> recovered from there.
>>
>> I suggest reading this:
>> https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html
>>
>> On Fri, Jan 11, 2019 at 7:33 AM Siew Wai Yow <wai_...@hotmail.com> wrote:
>>
>> Hello,
>>
>> May I know what happens to the state stored in a Flink Task Manager when
>> that Task Manager crashes? Say the state storage is RocksDB: would the data
>> be transferred to another running Task Manager so that the complete state
>> is available for data processing?
>>
>> Regards,
>> Yow
>>
>
> --
> Best,
> Congxian

--
Best,
Congxian