Thanks for the response Rong. Would be happy to clarify more. So there are two possible changes that could happen:
1. There could be a change in the incoming source schema. Since there's a deserialization phase here (JSON -> Pojo) i expect a couple of options. Backward compatible changes to the JSON should not have an impact (as the Pojo is the same), however we might want to change the Pojo which i believe is a state evolving action. I do want to migrate the Pojo to Avro - will that suffice for Schema evolution feature to work? 2. The other possible change is the SQL select fields change, as mention someone could add/delete/change-order another field to the SQL Select. I do see this as an issue per the way i transform the Row object to the dynamically generated class. This is done today through indices of the class fields and the ones of the Row object. This seems like an issue for when for example a select field is added in the middle and now there's an older Row which fields order is not matching the (new) generated Class fields order. I'm thinking of how to solve that one - i imagine this is not something the schema evolution feature can solve (am i right?). im thinking on whether there is a way i can transform the Row object to my generated class by maybe the Row's column names corresponding to the generated class field names, though i don't see Row object has any notion of column names. Would love to hear your thoughts. If you want me to paste some code here i can do so. Shahar On Thu, Mar 7, 2019 at 10:40 AM Rong Rong <walter...@gmail.com> wrote: > Hi Shahar, > > I wasn't sure which schema are you describing that is going to "evolve" > (is it the registered_table? or the output sink?). It will be great if you > can clarify more. > > For the example you provided, IMO it is more considered as logic change > instead of schema evolution: > - if you are changing max(c) to max(d) in your query. I don't think this > qualifies as schema evolution. > - if you are adding another column "max(d)" to your query along with your > existing "max(c)" that might be considered as a backward compatible change. > However, either case you will have to restart your logic, you can also > consult how state schema evolution [1], and there are many other problems > that can be tricky as well[2,3]. > > Thanks, > Rong > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/schema_evolution.html > [2] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Window-operator-schema-evolution-savepoint-deserialization-fail-td23079.html > [3] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Savepointing-with-Avro-Schema-change-td19290.html#a19293 > > > On Wed, Mar 6, 2019 at 12:52 PM shkob1 <shahar.kobrin...@gmail.com> wrote: > >> Hey, >> >> My job is built on SQL that is injected as an input to the job. so lets >> take >> an example of >> >> Select a,max(b) as MaxB,max(c) as MaxC FROM registered_table GROUP BY a >> >> (side note: in order for the state not to grow indefinitely i'm >> transforming >> to a retracted stream and filtering based on a custom trigger) >> >> In order to get the output as a Json format i basically created a way to >> dynamically generate a class and registering it to the class loader, so >> when >> transforming to the retracted stream im doing something like: >> >> Table result = tableEnv.sqlQuery(sqlExpression); >> tableEnv.toRetractStream(result, Row.class, config) >> .filter(tuple -> tuple.f0) >> .map(new RowToDynamicClassMapper(sqlSelectFields)) >> .addSink(..) >> >> This actually works pretty good (though i do need to make sure to register >> the dynamic class to the class loader whenever the state is loaded) >> >> Im now looking into "schema evolution" - which basically means what >> happens >> when the query is changed (say max(c) is removed, and maybe max(d) is >> added). I dont know if that fits the classic "schema evolution" feature or >> should that be thought about differently. Would be happy to get some >> thoughts. >> >> Thanks! >> >> >> >> >> >> >> >> -- >> Sent from: >> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ >> >