Re: Savepoint a failing job

2022-12-26 Thread Timothy Bess
Hi Hangxiang and Martijn, Thanks for the tips! That state processor API looks interesting, I'll have to dig into that more. The point about the query plan makes sense, but it also means that using SQL in my job end-to-end is a little dangerous if I need to guarantee that I'll never have to dump st

Re: Savepoint a failing job

2022-12-22 Thread Martijn Visser
Hi Tim, > Our job happens to be stateless, so we're okay this time, but if we had used state (like joining two streams or something) we would end up losing data to fix this bug. Is the only solution to just use the DataStream API? In case you have a change in your SQL statement, then yes you woul

Re: Savepoint a failing job

2022-12-21 Thread Hangxiang Yu
Hi Tim. > Is the only solution to just use the DataStream API? Just as Martijn mentioned, if the execution plan has changed, it's difficult to restore from the original state. Only if you are dropping some operators could you use --allowNonRestoredState to restore without droppin

Re: Savepoint a failing job

2022-12-21 Thread Timothy Bess
Hi Martijn, Sorry I didn't see your response! Basically we had a bad event that was blowing up our Python UDF, so we wanted to change the SQL to add a WHERE clause that filters out the event to mitigate the issue. Our job happens to be stateless, so we're okay this time, but if we had used state (

Re: Savepoint a failing job

2022-12-16 Thread Martijn Visser
Hi Tim, If I understand correctly, you need to deploy a new SQL statement in order to fix your issue? If so, the problem is that a new SQL statement might lead to a different execution plan which can't be restored. See https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/ov