Hi! I'm not sure if this is totally relevant for you, but we use JSONSchema and JSON with Flink at the Wikimedia Foundation. We explicitly disallow the use of additionalProperties <https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#No_object_additionalProperties>, unless it is to define Map type fields <https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#map_types> (where additionalProperties itself is a schema).

We have JSONSchema converters and JSON Serdes to be able to use our JSONSchemas and JSON records with both the DataStream API (as Row) and Table API (as RowData). See:
- https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json
- https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/#managing-a-object

State schema evolution is supported via the EventRowTypeInfo wrapper <https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/EventRowTypeInfo.java#42>.

Less directly about Flink: I gave a talk at Confluent's Current conf in 2022 about why we use JSONSchema <https://www.confluent.io/events/current-2022/wikipedias-event-data-platform-or-json-is-okay-too/>. See also this blog post series if you are interested <https://techblog.wikimedia.org/2020/09/10/wikimedias-event-data-platform-or-json-is-ok-too/>!

-Andrew Otto
 Wikimedia Foundation

On Fri, Feb 23, 2024 at 1:58 AM Salva Alcántara <salcantara...@gmail.com> wrote:

> I'm facing some issues related to schema evolution in combination with the usage of Json Schemas and I was just wondering whether there are any recommended best practices.
>
> In particular, I'm using the following code generator:
>
> - https://github.com/joelittlejohn/jsonschema2pojo
>
> Main gotchas so far relate to the `additionalProperties` field. When setting that to true, the resulting POJO is not valid according to Flink rules because the generated getter/setter methods don't follow the java beans naming conventions, e.g., see here:
>
> - https://github.com/joelittlejohn/jsonschema2pojo/issues/1589
>
> This means that the Kryo fallback is used for serialization purposes, which is not only bad for performance but also breaks state schema evolution.
>
> So, because of that, setting `additionalProperties` to `false` looks like a good idea, but then your job will break if an upstream/producer service adds a property to the messages you are reading. To solve this problem, the POJOs for your job (as a reader) can be generated to ignore the `additionalProperties` field (via the `@JsonIgnore` Jackson annotation). This seems to be a good overall solution to the problem, but it looks a bit convoluted to me and didn't come without some trial & error (= pain & frustration).
>
> Is there anyone here facing similar issues? It would be good to hear your thoughts on this!
>
> BTW, this is a very interesting article that touches on the above-mentioned difficulties:
>
> - https://www.creekservice.org/articles/2024/01/09/json-schema-evolution-part-2.html
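
To make the getter/setter problem described in the quoted message concrete, here is a minimal sketch in plain Java (no Flink or Jackson dependency). Everything in it is hypothetical: the names PojoRuleCheck, GoodEvent, GeneratedEvent and fieldsBreakingBeanRules are made up for illustration, and the shape of GeneratedEvent is only an assumption about what a generator might emit for `additionalProperties: true` (a map field with a plural getter and a singular, two-argument setter). The check is a simplified approximation of the "every non-public field needs a bean-style getter and setter" rule, not Flink's actual type-extraction logic.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PojoRuleCheck {

    /** Hypothetical reader-side POJO with conventional bean accessors. */
    public static class GoodEvent {
        private String id;

        public GoodEvent() {}

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
    }

    /**
     * Hypothetical POJO shaped like generator output for additionalProperties: true
     * (assumed shape, not copied from any real generated class).
     */
    public static class GeneratedEvent {
        private String id;
        private Map<String, Object> additionalProperties = new HashMap<>();

        public GeneratedEvent() {}

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        public Map<String, Object> getAdditionalProperties() { return additionalProperties; }

        // Singular name and two arguments: there is no setAdditionalProperties(Map).
        public void setAdditionalProperty(String name, Object value) {
            additionalProperties.put(name, value);
        }
    }

    /** Returns the names of non-public fields that lack a bean-style getter or setter. */
    public static List<String> fieldsBreakingBeanRules(Class<?> type) {
        List<String> offenders = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(mods)
                    || Modifier.isTransient(mods) || Modifier.isPublic(mods)) {
                continue; // these fields need no accessors under this simplified rule
            }
            String name = field.getName();
            String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            boolean hasGetter = hasMethod(type, "get" + suffix) || hasMethod(type, "is" + suffix);
            boolean hasSetter = hasMethod(type, "set" + suffix, field.getType());
            if (!hasGetter || !hasSetter) {
                offenders.add(name);
            }
        }
        return offenders;
    }

    /** True if the class exposes a public method with exactly this name and parameter list. */
    private static boolean hasMethod(Class<?> type, String name, Class<?>... params) {
        try {
            Method method = type.getMethod(name, params);
            return Modifier.isPublic(method.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldsBreakingBeanRules(GoodEvent.class));      // prints []
        System.out.println(fieldsBreakingBeanRules(GeneratedEvent.class)); // prints [additionalProperties]
    }
}
```

Running a check like this against generated classes in a unit test is one way to notice early that a type would not satisfy the bean-accessor rule, which per the quoted message is what leads to the Kryo fallback.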