Re: [DISCUSS] State Schema Evolution for RowData

2025-08-19 Thread Gabor Somogyi
Hi Weiqing, I've just read through the whole FLIP and +1 on the direction. I have one comment apart from the other pending items: the configuration key is `state.schema-evolution.enable`, which suggested to me that this is a generic state evolution feature, but it is limited to Row data. Maybe we can mar

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-31 Thread Weiqing Yang
Hi Shengkai, Hongshun, I’ve added one of our job examples here to help illustrate the schema evolution scenario in practice: Example Job (Note: Some SQL logic and schema details are redacted.) @Shengka

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-23 Thread Hongshun Wang
Hi Weiqing, I like the idea. Could you give an example of how a Kafka or MySQL connector uses it to read data with different schemas, for better understanding? Best, Hongshun On Thu, Jul 24, 2025 at 10:34 AM Shengkai Fang wrote: > Hi, Weiqing. > > Thanks for the update. It's better you can update t

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-23 Thread Shengkai Fang
Hi, Weiqing. Thanks for the update. It would be better to update the cwiki rather than the Google doc. After reading the doc, I just feel this feature is not applicable for users because few users understand the SQL operator state structure and in some cases, the operator state structure is related to

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-22 Thread Weiqing Yang
Hi Shengkai, Thank you for your detailed feedback! I've updated the proposal to incorporate your suggestions: 1. fieldNames vs. originalRowType: I agree that using fieldNames: String[] is a more focused and lightweight approach than storing the full originalRowType. The proposal has bee

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-20 Thread Shengkai Fang
Hi Weiqing. +1 for the FLIP. I have some suggestions about the FLIP: 1. Compared to adding a field named originalRowType in `RowDataSerializer`, I prefer to add a field named fieldNames with type String[] . WDYT? I think this field is used for name-based field mapping, so we just add the required
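For readers skimming the thread, the name-based field mapping that a `fieldNames` array would enable can be sketched roughly as below. This is a standalone illustration under stated assumptions, not Flink code: the class `FieldNameMapping` and both of its methods are hypothetical, and rows are modelled as plain `Object[]` rather than `RowData`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of name-based field mapping; not part of Flink. */
public class FieldNameMapping {

    /**
     * For each field of the new schema, returns the index of the same-named
     * field in the old schema, or -1 if the field does not exist there.
     */
    public static int[] buildMapping(String[] oldFieldNames, String[] newFieldNames) {
        Map<String, Integer> oldIndex = new HashMap<>();
        for (int i = 0; i < oldFieldNames.length; i++) {
            oldIndex.put(oldFieldNames[i], i);
        }
        int[] mapping = new int[newFieldNames.length];
        for (int i = 0; i < newFieldNames.length; i++) {
            mapping[i] = oldIndex.getOrDefault(newFieldNames[i], -1);
        }
        return mapping;
    }

    /** Rearranges an old row into the new layout; fields missing in the old row become null. */
    public static Object[] migrateRow(Object[] oldRow, int[] mapping) {
        Object[] newRow = new Object[mapping.length];
        for (int i = 0; i < mapping.length; i++) {
            newRow[i] = mapping[i] >= 0 ? oldRow[mapping[i]] : null;
        }
        return newRow;
    }
}
```

Only the field names are needed to build the mapping, which is the reason given in the message above for preferring a `String[]` over a full row type.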

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-17 Thread Weiqing Yang
Hi Zakelly and Hangxiang, Just checking in - do you have any concerns or feedback? If there are no further objections from anyone, I’ll mark the FLIP as ready for voting. Best, Weiqing On Sun, Jul 6, 2025 at 11:46 PM Weiqing Yang wrote: > Hi Hangxiang, Zakelly, > > Thank you for the careful

Re: [DISCUSS] State Schema Evolution for RowData

2025-07-06 Thread Weiqing Yang
Hi Hangxiang, Zakelly, Thank you for the careful review and the +1 on the proposal. *1. Where to host the migration logic* I experimented with placing the migration hook on TypeSerializerSchemaCompatibility, but ran into two issues: - Is the "schemaEvolutionSerializer" intended to be the

Re: [DISCUSS] State Schema Evolution for RowData

2025-05-06 Thread Hangxiang Yu
Hi, Weiqing. Thanks for driving this FLIP. I'm +1 for supporting schema evolution for SQL RowData type. I just have some questions: 1. Could we consider defining a method returning *SchemaEvolutionSerializer* in *TypeSerializerSchemaCompatibility* (like compatibleAfterMigration(TypeSerializer sche

Re: [DISCUSS] State Schema Evolution for RowData

2025-04-28 Thread Weiqing Yang
Thanks for the suggestions, Zakelly! Regarding *migrateElement* - it is specifically needed for ListState, which stores elements individually with delimiters. Its implementation deserializes and processes each element one by one during migration, so I introduced the *migrateElement* API to handle
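A minimal sketch of the per-element idea described above, assuming a list is stored as elements separated by a single delimiter byte. Everything here is hypothetical and simplified: the class `ListElementMigration`, the `','` delimiter, and the use of `writeUTF` strings as elements are stand-ins chosen for illustration, not the FLIP's API or Flink's actual state format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of migrating a delimited list one element at a time. */
public class ListElementMigration {

    // Assumed delimiter byte between serialized elements.
    static final byte DELIMITER = ',';

    /** Serializes elements as length-prefixed strings separated by DELIMITER. */
    public static byte[] writeList(String... elements) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < elements.length; i++) {
                if (i > 0) {
                    out.writeByte(DELIMITER);
                }
                out.writeUTF(elements[i]);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads each old element, applies the per-element migration, and re-writes it. */
    public static byte[] migrateList(byte[] oldBytes, UnaryOperator<String> migrateElement) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(oldBytes));
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            boolean first = true;
            while (in.available() > 0) {
                if (!first) {
                    in.readByte(); // skip the delimiter before the next element
                    out.writeByte(DELIMITER);
                }
                out.writeUTF(migrateElement.apply(in.readUTF()));
                first = false;
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the delimiters sit between elements, the whole byte array cannot be handed to a single element deserializer; the loop has to step through it element by element, which is the situation the message describes for ListState.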

Re: [DISCUSS] State Schema Evolution for RowData

2025-04-28 Thread Zakelly Lan
Hi Weiqing, Thanks for your answers! It seems a simple deserialization-serialization lacks flexibility, thus I'd agree to introduce new methods. I'd suggest changing the signature to: ``` public void migrateState( TypeSerializerSnapshot oldSerializerSnapshot, DataI

Re: [DISCUSS] State Schema Evolution for RowData

2025-04-27 Thread Weiqing Yang
Hi Zakelly, Thanks for your feedback. You're right - *resolveSchemaCompatibility* is critical for identifying schema compatibility. However, our challenge extends beyond detection to handling the actual migration process, particularly given RowData’s complex requirements. The standard migration

Re: [DISCUSS] State Schema Evolution for RowData

2025-04-26 Thread Zakelly Lan
Hi, Weiqing. Thanks for the FLIP! In general I'd +1 for schema evolution for RowData types, which will enhance the user experience of SQL jobs. I have one question for now: You suggested introducing new methods in `TypeSerializerSnapshot`, but is it possible to leverage existing state migration