Hi Gabor,

That sounds very reasonable. I will put this on my list to try, hopefully in the next few days.
I think the “operator UID and/or state processor API” are good strategies. If we can identify those cases and socialise them, or provide a utility early, that could save a lot of pain during v2 adoption.

Thanks for your feedback,
Kind regards, David.

From: Gabor Somogyi <gabor.g.somo...@gmail.com>
Date: Monday, 11 November 2024 at 20:13
To: dev@flink.apache.org <dev@flink.apache.org>
Cc: Yuan Mei <yuanmei.w...@gmail.com>
Subject: [EXTERNAL] Re: Flink V1 to V2 state migration

Hi David,

I think the first task would be to test and collect what is not working when migrating from 1.20 to 2.0-preview. Seeing such issues in advance would be really awesome, since it would ease the fixing/adaptation. When we have such a list we can discuss how severe they are and how to address them.

My first shallow opinion is the same as Zakelly's, namely that the majority of the issues can be fixed by providing an operator UID and/or using the State Processor API. The migration path between any kind of state backends is the savepoint in canonical format, so I personally don't see any further path that needs to be built.

All in all I would create a checkpoint with 1.20 where all the state types are written (this can be quite extensive so maybe try a couple first) and restore it with 2.0-preview. Sooner or later we're going to face all the issues, but the sooner we see them the better.

G

On Mon, Nov 11, 2024 at 6:31 PM David Radley <david_rad...@uk.ibm.com> wrote:

> Hello Zakelly,
> Thank you very much for your responses.
>
> I was wondering whether we could have a migration utility:
>
> 1. You mention “However, manually specifying the operator ids can avoid
>    this problem.” I assume this is something we could add to the migration
>    utility. Is this an issue only if they change state backends?
> 2. The SQL jobs are using auto-assigned operator ids, I'm not sure if
>    it still generates the ids in the same way as previously.
>    Is this id something that is state backend specific, or a change in
>    the SQL / table planner layer?
> 3. When you say the serializers might not be compatible, are we talking
>    about changing state backend or in general between v1 and v2?
> 4. You say, “If you are using newly introduced functions in 2.0, such
>    as the disaggregated state, the state cannot be migrated.” I assume
>    users would migrate the state from V1, which would not utilize any v2
>    functionality. Then the question is how do you migrate from RocksDB to
>    the new backend at V2?
>
> @Yuan Mei<mailto:yuanmei.w...@gmail.com> You mentioned “mostly we want
> forstdb state backend to stablized and makes it efficient at the first”. I
> wonder what the timescale is likely to be for this and how migration
> considerations will affect this. I suspect that if the forstdb APIs are
> stabilized then a migration utility could work against them at the same
> time as the implementation is made more efficient. Are the forstdb APIs
> likely to change or are they pretty stable?
>
> In terms of migration activities I am seeing:
>
> 1. Conduct migration tests from 1.20 to 2.0-preview. Is there a set of
>    state topologies / permutations that would be a good foundation for
>    testing the state? I guess we are thinking RocksDB v1 to v2.
> 2. Facilitate RocksDB v1 to v2 state migration where possible.
> 3. Then, to encourage adoption of the forstdb state backend, we should
>    look at RocksDB to forstdb state migration.
>
> WDYT?
>
> Kind regards, David.
>
> From: Zakelly Lan <zakelly....@gmail.com>
> Date: Monday, 11 November 2024 at 09:31
> To: dev@flink.apache.org <dev@flink.apache.org>
> Subject: [EXTERNAL] Re: Flink V1 to V2 state migration
>
> Hi Alexis,
>
> If you want to utilize the disaggregated state in Flink 2.0, you will need
> to configure it to use the newly introduced ForSt Statebackend. We don't
> support migrating states via checkpoints from one state backend to another.
> However, it is possible to migrate state across state backends via
> savepoints in canonical format, once we complete the savepoint support for
> ForSt. This may take some time.
>
> In 2.0, you can still use the existing state backends, rocksdb or
> hashmap, in which case the disaggregated state is disabled. These options
> will remain across 2.x releases.
>
> Best,
> Zakelly
>
> On Mon, Nov 11, 2024 at 5:05 PM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
> > Hi Zakelly,
> >
> > For point 4, can you clarify two things:
> >
> > - Is that incompatibility expected to be temporary, or will such a
> >   migration never be possible?
> > - Will it be possible to configure the backend so that disaggregated
> >   state is not used? I.e. keeping the current logic. And will this
> >   choice remain for the whole Flink 2.x release?
> >
> > Regards,
> > Alexis.
> >
> > On Mon, Nov 11, 2024 at 07:22, Zakelly Lan <zakelly....@gmail.com> wrote:
> >
> > > Hi David and Galen,
> > >
> > > To the best of my knowledge, the state or checkpoint itself has not
> > > changed much (from 1.20 to 2.0), and will be compatible if you are
> > > using the same statebackend. But there are still some problems:
> > >
> > > 1. Due to numerous API breaking changes in 2.0, users may need to
> > >    rewrite their job, resulting in auto-assigned operator ids
> > >    changing. This means that the previous state cannot be mapped to
> > >    the operators and restored. However, manually specifying the
> > >    operator ids can avoid this problem.
> > > 2. The SQL jobs are using auto-assigned operator ids, I'm not sure if
> > >    it still generates the ids in the same way as previously.
> > > 3. The serializers might not be compatible. Fortunately, most of the
> > >    built-in serializers are IIUC compatible.
> > > 4. If you are using newly introduced functions in 2.0, such as the
> > >    disaggregated state, the state cannot be migrated.
> > >
> > > For 1, 2 and 3, I believe the State Processor API[1] can help users
> > > migrate their state. While we cannot guarantee that Flink V1 state can
> > > be migrated to V2, in many cases the state can be successfully
> > > migrated. Also, the newer the Flink version is, the easier migration
> > > would be. It would be helpful for someone to conduct migration tests
> > > from 1.20 to 2.0-preview and see if there are more issues on this.
> > >
> > > [1]
> > > https://nightlies.apache.org/flink/flink-docs-master/docs/libs/state_processor_api/
> > >
> > > Best,
> > > Zakelly
> > >
> > > On Sat, Nov 9, 2024 at 12:48 AM Galen Warren
> > > <ga...@cvillewarrens.com.invalid> wrote:
> > >
> > > > Yes, I'd like to second that it would be very helpful to have a way
> > > > to migrate state from V1 to V2.
> > > >
> > > > On Thu, Nov 7, 2024 at 12:11 PM David Radley
> > > > <david_rad...@uk.ibm.com> wrote:
> > > >
> > > > > Hello,
> > > > > At Flink Forward I learnt that Flink V1 state could not be
> > > > > migrated to V2. I think this would be a big migration inhibitor
> > > > > for current Flink users, as they would need to throw away their
> > > > > existing state. As such I think this is a critical, possibly
> > > > > blocking issue. Prior to Flink 2 going out, I think this needs to
> > > > > be looked at in case we need to amend the v2 state in some way to
> > > > > facilitate easier migration.
> > > > >
> > > > > Is this a task that is already in hand in the community? If not,
> > > > > I am happy to contribute this or be involved with the
> > > > > contribution – but would need guidance from the Flink PMCs on the
> > > > > approach.
> > > > > Kind regards, David.
> > > > >
> > > > > Unless otherwise stated above:
> > > > >
> > > > > IBM United Kingdom Limited
> > > > > Registered in England and Wales with number 741598
> > > > > Registered office: Building C, IBM Hursley Office, Hursley Park
> > > > > Road, Winchester, Hampshire SO21 2JN