Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-10-04 Thread Stephan Ewen
I think Josh found a "WIP" bug. The code is very much in flux because of the new feature that allows to change the parallelism with which savepoints are resumed. The "user code class loader" is not yet properly used in the operator state backend when reloading snapshot state. This will be integrat

Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-10-04 Thread Till Rohrmann
Hi Josh, the internal state representation of Kafka sources has been changed recently so that it is now possible to rescale the Kafka sources. That is the reason why the old savepoint which contains the Kafka state in the old representation is not able to be read by the updated Kafka sources. The

Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-10-03 Thread Josh
Hi Stefan, Sorry for the late reply - I was away last week. I've just got round to retrying my above scenario (run my job, take a savepoint, restore my job) using the latest Flink 1.2-SNAPSHOT -- and am now seeing a different exception when restoring the state: 10/03/2016 11:29:02 Job execution

Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-09-22 Thread Stefan Richter
Hi, to me, this looks like you are running into the problem described under [FLINK-4603] : KeyedStateBackend cannot restore user code classes. I have opened a pull request (PR 2533) this morning that should fix this behavior as soon as it is merged into master. Best, Stefan > Am 21.09.2016 um

Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-09-21 Thread Josh
Hi Stephan, Thanks for the reply. I should have been a bit clearer but actually I was not trying to restore a 1.0 savepoint with 1.2. I ran my job on 1.2 from scratch (starting with no state), then took a savepoint and tried to restart it from the savepoint - and that's when I get this exception.

Re: Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-09-21 Thread Stephan Ewen
Hi Josh! The state code on 1.2-SNAPSHOT is undergoing pretty heavy changes right now, in order to add the elasticity feature (change parallelism or running jobs and still maintaining exactly once guarantees). At this point, no 1.0 savepoints can be restored in 1.2-SNAPSHOT. We will try and add com

Flink job fails to restore RocksDB state after upgrading to 1.2-SNAPSHOT

2016-09-21 Thread Josh
Hi, I have a Flink job which uses the RocksDBStateBackend, which has been running on a Flink 1.0 cluster. The job is written in Scala, and I previously made some changes to the job to ensure that state could be restored. For example, whenever I call `map` or `flatMap` on a DataStream, I pass a na