Re: How to ensure that job is restored from savepoint when using Flink SQL

shadowell Tue, 07 Jul 2020 19:50:50 -0700


Hi Fabian,



Thanks for your information!
Actually, I am not clear about the mechanism of auto-generated IDs in Flink SQL 
and the mechanism of how does the operator state mapping back from savepoint.
I hope to get some detail information by giving an example bellow.


I have two sql as samples:
old sql ： select id, name, sum(salary) from user_info where id == '001' group 
by TUMBLE(rowtime, INTERVAL '1' DAY), id, name;
new sql:   select id, name, sum(salary) from user_info where id == '001' and 
age >= '28' group by TUMBLE(rowtime, INTERVAL '1' DAY), id, name; 
I just add some age limitation in new SQL. Now, I want to switch the job from 
old one to the new one by trigger a savepoint. Flink will generate operator IDs 
for operators in new SQL.
In this case, just from a technical point of view,  the operator IDs in the 
savepoint of the old SQL job can match the operator IDs in the new SQL job?
My understanding is that Flink will reorder the operators and generate new IDs 
for operators. The new IDs may not match the old IDs. 
This will cause some states failed to be mapped back from the old job 
savepoint, which naturally leads to inaccurate calculation results.
I wonder if my understanding is correct.


Thanks~ 
Jie


| |
shadowell
|
|
shadow...@126.com
|
签名由网易邮箱大师定制
On 7/7/2020 17:23，Fabian Hueske<fhue...@gmail.com> wrote：
Hi Jie Feng,


As you said, Flink translates SQL queries into streaming programs with 
auto-generated operator IDs.
In order to start a SQL query from a savepoint, the operator IDs in the 
savepoint must match the IDs in the newly translated program.
Right now this can only be guaranteed if you translate the same query with the 
same Flink version (optimizer changes might change the structure of the 
resulting plan even if the query is the same).
This is of course a significant limitation, that the community is aware of and 
planning to improve in the future.


I'd also like to add that it can be very difficult to assess whether it is 
meaningful to start a query from a savepoint that was generated with a 
different query.
A savepoint holds intermediate data that is needed to compute the result of a 
query.
If you update a query it is very well possible that the result computed by 
Flink won't be equal to the actual result of the new query.



Best, Fabian



Am Mo., 6. Juli 2020 um 10:50 Uhr schrieb shadowell <shadow...@126.com>:



Hello, everyone,
        I have some unclear points when using Flink SQL. I hope to get an 
answer or tell me where I can find the answer.
        When using the DataStream API, in order to ensure that the job can 
recover the state from savepoint after adjustment, it is necessary to specify 
the uid for the operator. However, when using Flink SQL, the uid of the 
operator is automatically generated. If the SQL logic changes (operator order 
changes), when the task is restored from savepoint, will it cause some of the 
operator states to be unable to be mapped back, resulting in state loss?


Thanks~
Jie Feng 
| |
shadowell
|
|
shadow...@126.com
|
签名由网易邮箱大师定制

Re: How to ensure that job is restored from savepoint when using Flink SQL

Reply via email to