Hi Rion

Is your job DataStream or Table/SQL? If it is a Table/SQL job and you can
define all the JSON fields you need, you can directly use the JSON format [1]
to parse the data.
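For example, a rough sketch of the Table API approach (topic name, field
names and broker address below are just placeholders, adjust them to your
data):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class JsonFormatExample {
  public static void main(String[] args) {
    TableEnvironment tEnv =
        TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    // Declare only the JSON fields you care about; other properties in the
    // payload are simply not mapped.
    tEnv.executeSql(
        "CREATE TABLE events (" +
        "  id STRING," +
        "  event_time TIMESTAMP(3)," +
        "  user_id STRING" +
        ") WITH (" +
        "  'connector' = 'kafka'," +
        "  'topic' = 'events'," +
        "  'properties.bootstrap.servers' = 'localhost:9092'," +
        "  'scan.startup.mode' = 'latest-offset'," +
        "  'format' = 'json'," +
        "  'json.ignore-parse-errors' = 'true'" +
        ")");
    tEnv.executeSql("SELECT id, user_id FROM events").print();
  }
}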

You can also write custom UDFs to parse the JSON data into structured types
supported by Flink, such as MAP or ROW; see the sketch below.
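A minimal sketch of such a UDF, assuming Jackson is on the classpath (the
function name and error handling are just examples):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Turns a JSON string into a MAP<STRING, STRING> of its top-level fields.
public class JsonToMap extends ScalarFunction {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  @DataTypeHint("MAP<STRING, STRING>")
  public Map<String, String> eval(String json) {
    Map<String, String> result = new HashMap<>();
    if (json == null) {
      return result;
    }
    try {
      JsonNode node = MAPPER.readTree(json);
      Iterator<Map.Entry<String, JsonNode>> fields = node.fields();
      while (fields.hasNext()) {
        Map.Entry<String, JsonNode> field = fields.next();
        result.put(field.getKey(), field.getValue().asText());
      }
    } catch (Exception e) {
      // Leave the map empty on malformed input; adapt this to your needs.
    }
    return result;
  }
}

You could then register it and use it in SQL, e.g.
tEnv.createTemporarySystemFunction("JSON_TO_MAP", JsonToMap.class);
and SELECT JSON_TO_MAP(raw) FROM source_table;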


[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/json/

Best,
Shammon FY


On Sun, Mar 19, 2023 at 7:44 AM Rion Williams <rionmons...@gmail.com> wrote:

> Hi all,
>
> I’m reaching out today for some suggestions (and hopefully a solution) for
> a Flink job that I’m working on. The job itself reads JSON strings from a
> Kafka topic and reads those into JSONObjects (currently via Gson), which
> are then operated against, before ultimately being written out to Kafka
> again.
>
> The problem here is that the shape of the data can vary wildly and
> dynamically. Some records may have properties unique to only that record,
> which makes defining a POJO difficult. In addition to this, the JSONObjects
> fall back to Kryo serialization, which is leading to atrocious throughput.
>
> I basically need to read in JSON strings, enrich properties on these
> objects, and ultimately write them to various sinks.  Is there some type of
> JSON-based class or library or an approach I could use to accomplish this
> in an efficient manner? Or if possibly a way to partially write a POJO that
> would allow me to interact with sections/properties of the JSON while
> retaining other properties that might be dynamically present or unique to
> the message?
>
> Any advice or suggestions would be welcome! I’ll also be happy to provide
> any additional context if it would help!
>
> Thanks,
>
> Rion
>
> (cross-posted to users+dev for reach)
