Hi Marco,

> I assume that all the data within the checkpoint are stored within the given Savepoint. Is that assumption correct?

Yes.

> I have not figured out how to correct / augment / fix the state though. Can somebody please explain?

Please try it this way:

1. Load the old savepoint file to create Savepoint obj1.
2. Read the state of the operator with UID Y from obj1 (returned by step 1).
3. Create a `BootstrapTransformation` through the entry point class `OperatorTransformation`: bootstrap the new operator state from the dataset returned by step 2, and correct or fix the old state of operator Y in a `StateBootstrapFunction` or `KeyedStateBootstrapFunction`.
4. Load the old savepoint file again to create Savepoint obj2.
5. Drop the old operator with UID Y by calling `removeOperator` on obj2 (returned by step 4).
6. Add a new operator with UID Y by calling `withOperator` on obj2; the first parameter is the UID (Y), the second parameter is the `BootstrapTransformation` returned by step 3.
7. Write obj2, as modified in steps 5 and 6, out to a new path.
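
The seven steps above could be sketched roughly as follows with the DataSet-based State Processor API of Flink 1.13, assuming `flink-state-processor-api` is on the classpath. The state layout is an assumption made only for illustration: operator Y is taken to hold a single keyed `ValueState<Long>` named "count" with `String` keys, and the correction rule in `fix` is hypothetical. The class name, the paths, and the state backend passed to `rewrite` are placeholders as well.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.StateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class FixOperatorYState {

    // Assumption: operator Y keeps one ValueState<Long> called "count" per String key.
    static final ValueStateDescriptor<Long> COUNT =
            new ValueStateDescriptor<>("count", Types.LONG);

    // Hypothetical correction rule: negative counts are invalid and are reset to 0.
    public static long fix(long oldValue) {
        return Math.max(0L, oldValue);
    }

    // Step 2: emit one (key, value) pair for every key stored in operator Y.
    static class Reader extends KeyedStateReaderFunction<String, Tuple2<String, Long>> {
        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(COUNT);
        }

        @Override
        public void readKey(String key, Context ctx, Collector<Tuple2<String, Long>> out)
                throws Exception {
            Long value = count.value();
            if (value != null) {
                out.collect(Tuple2.of(key, value));
            }
        }
    }

    // Step 3: write the corrected value back into the same state descriptor.
    static class Fixer extends KeyedStateBootstrapFunction<String, Tuple2<String, Long>> {
        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(COUNT);
        }

        @Override
        public void processElement(Tuple2<String, Long> value, Context ctx) throws Exception {
            count.update(fix(value.f1));
        }
    }

    static void rewrite(String oldPath, String newPath, StateBackend backend) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Steps 1 and 2: load the old savepoint (obj1) and read the state of operator Y.
        ExistingSavepoint obj1 = Savepoint.load(env, oldPath, backend);
        DataSet<Tuple2<String, Long>> oldState = obj1.readKeyedState("Y", new Reader());

        // Step 3: build the transformation that bootstraps the corrected state.
        BootstrapTransformation<Tuple2<String, Long>> fixed = OperatorTransformation
                .bootstrapWith(oldState)
                .keyBy(t -> t.f0)
                .transform(new Fixer());

        // Steps 4 to 7: load the old savepoint again (obj2), swap operator Y, write out.
        Savepoint.load(env, oldPath, backend)
                .removeOperator("Y")
                .withOperator("Y", fixed)
                .write(newPath);

        // write() only defines the batch job; execute() actually runs it.
        env.execute("fix state of operator Y");
    }
}
```

Note that after `removeOperator` and `withOperator`, operator Y contains only what the bootstrap function wrote, so if Y holds more than one state, each of them would need to be read in step 2 and written again in step 3.
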

This way, in the new savepoint files the states of the operators with UIDs W, X, and Z are intact; only the state of operator Y is updated.

Details on reading, writing, and modifying savepoints can be found in the documentation [1].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/docs/libs/state_processor_api/

Best regards,
JING ZHANG

Marco Villalobos <mvillalo...@kineteque.com> wrote on Tue, Jun 29, 2021 at 6:00 AM:

> Let's say that a job has operators with UIDs: W, X, Y, and Z, and uses
> RocksDB as a backend with checkpoint data URI s3://checkpoints"
>
> Then I stop the job with a savepoint at s3://savepoint-1.
>
> I assume that all the data within the checkpoint are stored within the
> given Savepoint. Is that assumption correct?
>
> Then, how can I fix the state in operator with UID Y, but keep all the
> data in the other operators intact?
>
> I know how to bootstrap state with the state-processor API.
> I have not figured out how to correct / augment / fix the state though.
>
> Can somebody please explain?
>