Hi Zakelly, Thanks for your comments, please find my answers inline.
BR, G On Wed, Dec 18, 2024 at 2:42 PM Zakelly Lan <zakelly....@gmail.com> wrote: > Hi Gabor, > > Thanks for the FLIP! It's very convenient for users to use SQL to read > checkpoint data, so overall I'd +1 for this FLIP. Here are my > thoughts/questions: > > 1. I see this connector is designed to read the checkpoint/savepoint > data, so I'd suggest naming this connector as 'checkpoint' or > 'savepoint', > which is more intuitive. The "connector='state'" makes me think this is > reading the runtime state. Also the 'state.path' can be changed to > 'checkpoint.path'. > Rename is a possibility, I'm fine with either way in terms of naming. The only thing what I would consider is that later on it would be good to add metadata reading possibility. Such case we can either provide "savepoint" for data reading and "savepoint_metadata" or simply "metadata" for meta reading (See 3rd bullet for another approach). Please share your thoughts on this. > 2. IIUC this connector is a bounded source and batch execution is > preferred, right? And it cannot read checkpoints while the original job > is > running and doing periodic checkpoints. Thus I'd prefer the name > 'savepoint', as the lifecycle of savepoints is managed by the user > instead > of the Flink job. > We can call it savepoint, just agree on the metadata naming as well to have an overall picture. > 3. As you pictured, there might be more connectors for states other than > keyed state, so I'm thinking about providing them in one connector with > an > option 'state.type' enumerating 'keyed', 'non-keyed' ('operator'), and > so > on. Well for the first version this is not a required option. Seems more > organized. WDYT? > Yeah, later on we must provide such config parameter somehow. Couple of things to consider: - Maybe we can add "metadata" as possible value here instead of "connector=X" - I'm thinking about whether we can come up with something where we can mix metadata, keyed and non-keyed data in a single SQL statement. Without super deep consideration I would say no, but I'm going to think it through. > 4. I'm not sure if the general option 'key.format' is needed, would > using 'fields.k.format' while the 'k' is the name of the SQL primary > column > be feasible? > I think that's a nice catch of a simplification. > 5. Reading window state[1] is unsupported, right? > Yeah, not yet covered here and can come in a later FLIP. > > > [1] > > https://nightlies.apache.org/flink/flink-docs-master/docs/libs/state_processor_api/#window-state > > Best, > Zakelly > > On Wed, Dec 18, 2024 at 5:46 PM Gabor Somogyi <gabor.g.somo...@gmail.com> > wrote: > > > Hi All, > > > > I'd like to start a discussion of FLIP-496: SQL connector for keyed state > > data [1]. > > Feel free to add your thoughts to make this feature better. > > > > [1] > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-496%3A+SQL+connector+for+keyed+state+data > > > > BR, > > G > > >