So, to check my understanding:
State:
stores the result of some operators (such as keyBy, map).

Checkpoint:
stores the last result from when the program was running OK.


Am I right?
Thanks for your help~!




------------------ Original Message ------------------
From: "Congxian Qiu" <qcx978132...@gmail.com>
Sent: Tuesday, October 13, 2020, 1:32 PM
To: "大森林" <appleyu...@foxmail.com>
Cc: "Arvid Heise" <ar...@ververica.com>; "Shengkai Fang" <fskm...@gmail.com>; "user" <user@flink.apache.org>
Subject: Re: why we need keyed state and operate state when we already have checkpoint?



Hi,

As others said, state is different from a checkpoint. A checkpoint is just a
**snapshot** of the state, and you can restore from the previous checkpoint if
the job crashes.

State is for stateful computation, and checkpointing is for fault
tolerance [1].


The state keeps the information you'll need in the future. Take word count as
an example: the count of a word depends on the total count of that word we
have seen so far, so we need to keep "the total count of the word seen before"
somewhere. In Flink you can keep it in state.

A checkpoint/savepoint contains the **snapshot** of all the state. If there is
no state, the checkpoint will be *empty*: you can restore from it, but the
content is empty.
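
To make the word count example above concrete, here is a minimal sketch in plain Python. It is not Flink's API: the `WordCounter` class and its `snapshot`/`restore` methods are hypothetical names used only to show that the state is what the computation reads and updates on every record, while a checkpoint is a copy of that state taken at some moment.

```python
import copy


class WordCounter:
    """Toy stand-in for a stateful operator (hypothetical, not Flink's API)."""

    def __init__(self):
        # The "state": running total per word, needed to process future records.
        self.counts = {}

    def process(self, word):
        # The new count depends on what was seen before, so it must be stored.
        self.counts[word] = self.counts.get(word, 0) + 1
        return word, self.counts[word]

    def snapshot(self):
        # A "checkpoint" is only a copy of the current state.
        return copy.deepcopy(self.counts)

    def restore(self, checkpoint):
        # Recovery replaces the live state with the copied one.
        self.counts = copy.deepcopy(checkpoint)


counter = WordCounter()
empty_checkpoint = counter.snapshot()  # no state yet, so the snapshot is empty

for word in ["a", "b", "a"]:
    counter.process(word)

checkpoint = counter.snapshot()  # snapshot taken after three records
counter.process("a")             # progress made after the checkpoint
counter.restore(checkpoint)      # simulated recovery: back to the snapshot
```

Without `counts`, the operator could not compute a running total at all; without `snapshot`, it could compute the total but would lose it on a crash.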


PS: you may not create state explicitly, but some operators in Flink (such as
WindowOperator) keep state of their own.


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/stateful-stream-processing.html
Best,

Congxian









On Mon, Oct 12, 2020 at 9:26 PM 大森林 <appleyu...@foxmail.com> wrote:

Thanks for your replies.
When I use no state-related code in my program, the checkpoint can still be
saved and resumed. ❶


So why do we need Keyed State/Operator State/Stateful Functions? ❷
"the operators are reset to the time of the respective checkpoint."
We have already met the requirement: "resume from checkpoint (the last state of
each operator, which stores the result)" ❶,
so why do we still need ❷?
Thanks for your help~!






------------------ Original Message ------------------
From: "Arvid Heise" <ar...@ververica.com>
Sent: Monday, October 12, 2020, 2:53 PM
To: "大森林" <appleyu...@foxmail.com>
Cc: "Shengkai Fang" <fskm...@gmail.com>; "user" <user@flink.apache.org>
Subject: Re: why we need keyed state and operate state when we already have checkpoint?



Hi 大森林,


You can always resume from checkpoints, regardless of whether the operators use
keyed or non-keyed state.
A checkpoint contains the state of all operators at a given point in time. Each
operator may have keyed state, raw state, or non-keyed state.

As long as you are not changing the operators (too much) before restarting, you 
can always restart.


During (automatic) restart of a Flink application, the state of a given 
checkpoint is restored to the operators, such that it looks like the operator 
never failed. However, the operators are reset to the time of the respective 
checkpoint.
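
The restart behaviour described above can be sketched in plain Python. This is a toy model, not Flink code: the `run` function, the `fail_at` parameter, and the idea of storing a read offset next to the running total are assumptions made for illustration, and the sketch assumes the input can be re-read from an earlier position.

```python
import copy


def run(events, checkpoint_every, fail_at=None):
    """Sum the events; optionally crash once at index `fail_at` and recover."""
    state = {"offset": 0, "total": 0}   # live state of the toy job
    checkpoint = copy.deepcopy(state)   # last completed snapshot
    failed = False

    while state["offset"] < len(events):
        i = state["offset"]
        if fail_at is not None and i == fail_at and not failed:
            failed = True
            # Work done since the last checkpoint is discarded: the job is
            # reset to the time of that checkpoint and re-reads from there.
            state = copy.deepcopy(checkpoint)
            continue
        state["total"] += events[i]
        state["offset"] = i + 1
        if state["offset"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)

    return state["total"]


events = [1, 2, 3, 4, 5, 6, 7]
without_failure = run(events, checkpoint_every=3)
with_failure = run(events, checkpoint_every=3, fail_at=5)
```

Both runs end with the same total: the events between the last checkpoint and the crash are simply processed again, so from the outside it looks as if the job never failed.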



I have no clue what you mean by "previous variable temporary result".


On Wed, Oct 7, 2020 at 9:13 AM 大森林 <appleyu...@foxmail.com> wrote:

Thanks for your replies. Here is my current understanding.


There are two cases.
1. If I use no keyed state in my program, then when it is killed I can only
resume from the previous result.
2. If I use keyed state in my program, then when it is killed I can resume from
the previous result and the previous variable temporary result.


Am I right?
Thanks for your guidance.




------------------ Original Message ------------------
From: "Arvid Heise" <ar...@ververica.com>
Sent: Wednesday, October 7, 2020, 2:25 PM
To: "大森林" <appleyu...@foxmail.com>
Cc: "Shengkai Fang" <fskm...@gmail.com>; "user" <user@flink.apache.org>
Subject: Re: why we need keyed state and operate state when we already have checkpoint?



I think there is some misunderstanding here: a checkpoint IS (a snapshot of) 
the keyed state and operator state (among a few more things). [1]


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/learn-flink/fault_tolerance.html#definitions
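
As a rough illustration of "a checkpoint is a snapshot of the keyed state and operator state", the following plain-Python sketch models each parallel operator instance as holding both kinds of state. The `OperatorInstance` class and the `take_checkpoint` function are hypothetical names, not part of Flink; the example values are made up.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class OperatorInstance:
    """Toy model of one parallel operator instance (hypothetical names)."""
    keyed_state: dict = field(default_factory=dict)     # one entry per key
    operator_state: dict = field(default_factory=dict)  # scoped to the instance


def take_checkpoint(instances):
    # The checkpoint is nothing more than a copy of both kinds of state.
    return [copy.deepcopy(instance) for instance in instances]


source = OperatorInstance(operator_state={"offset": 42})
counter = OperatorInstance(keyed_state={"flink": 3, "state": 1})

checkpoint = take_checkpoint([source, counter])
counter.keyed_state["flink"] += 1  # live state moves on; the snapshot does not
```

In this model there is nothing in the checkpoint that was not state first, which is why the two concepts cannot replace each other.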


On Wed, Oct 7, 2020 at 6:51 AM 大森林 <appleyu...@foxmail.com> wrote:

When the job is killed, the state is also missing.
So why do we need keyed state? Is keyed state useful when we try to resume the
killed job?




------------------ Original Message ------------------
From: "Shengkai Fang" <fskm...@gmail.com>
Sent: Wednesday, October 7, 2020, 12:43 PM
To: "大森林" <appleyu...@foxmail.com>
Cc: "user" <user@flink.apache.org>
Subject: Re: why we need keyed state and operate state when we already have checkpoint?



The checkpoint is a snapshot of the job, and we can resume the job from it if
the job is killed unexpectedly. State is a different thing: it remembers the
intermediate results of a calculation. I don't think the checkpoint can replace
state.

On Wed, Oct 7, 2020 at 12:26 PM 大森林 <appleyu...@foxmail.com> wrote:

Could you tell me:


Why do we need keyed state and operator state when we already have checkpoints?

When a running jar crashes, we can resume from the checkpoint
automatically/manually.
So why do we still need keyed state and operator state?


Thanks





-- 

Arvid Heise | Senior Java Developer




Follow us @VervericaData

--

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni)
Cheng




