Re: Handling JSON Serialization without Kryo

2023-03-27 Thread Andrew Otto
Hi,

> The problem here is that the shape of the data can vary wildly and dynamically. Some records may have properties unique to only that record, which makes defining a POJO difficult

AFAIK, the only way to avoid POJOs in Flink is to use Row (DataStream) or RowData (Table API). These are Flink'

Re: Handling JSON Serialization without Kryo

2023-03-22 Thread Rion Williams
Hi Ken, I’m going to profile the job today to try and get a better handle on where the bottleneck is. The job currently just passes around JsonObjects between the operators, which are relying on Kryo. The job also writes to Postgres, Kafka, and Elasticsearch so it’s possible that one of those is cau

Re: Handling JSON Serialization without Kryo

2023-03-21 Thread Ken Krugler
Hi Rion,

I’m using Gson to deserialize to a Map. 1-2 records/second sounds way too slow, unless each record is enormous.

-- Ken

> On Mar 21, 2023, at 6:18 AM, Rion Williams wrote:
>
> Hi Ken,
>
> Thanks for the response. I hadn't tried exploring the use of the Record class, which I'm assum

Re: Handling JSON Serialization without Kryo

2023-03-20 Thread Rion Williams
Hi Shammon, Unfortunately it’s a data stream job. I’ve been exploring a few options but haven’t settled on anything yet. I’m currently looking at whether I can leverage some type of partial serialization to bind to the properties that I know the job will use and retain the rest as a JSO
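The partial-binding idea mentioned above could look roughly like the following. This is only a sketch under assumptions and not code from the thread: the class name `PartialEvent`, the known fields `id` and `timestamp`, and the `extras` map are all hypothetical, and the input is assumed to be a `Map` that a JSON library has already produced. It is plain Java and does not address how Flink would serialize the catch-all field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record shape: two known, typed fields plus a catch-all
// for every property the job does not bind explicitly.
final class PartialEvent {
    public String id;        // assumed field name, for illustration only
    public long timestamp;   // assumed field name, for illustration only
    public Map<String, String> extras = new LinkedHashMap<>();

    public PartialEvent() {}

    // Splits an already-parsed JSON object into known fields and leftovers.
    public static PartialEvent fromMap(Map<String, Object> parsed) {
        PartialEvent event = new PartialEvent();
        for (Map.Entry<String, Object> entry : parsed.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            if ("id".equals(key) && value != null) {
                event.id = value.toString();
            } else if ("timestamp".equals(key) && value instanceof Number) {
                event.timestamp = ((Number) value).longValue();
            } else {
                // Anything not bound above is kept as a string under its key.
                event.extras.put(key, String.valueOf(value));
            }
        }
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("id", "abc");
        parsed.put("timestamp", 1679300000000L);
        parsed.put("color", "red"); // a property unique to this record
        PartialEvent event = PartialEvent.fromMap(parsed);
        System.out.println(event.id + " " + event.timestamp + " " + event.extras);
    }
}
```

The typed fields are what the operators would read directly, while record-specific properties stay available under their original keys without needing a field of their own.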

Re: Handling JSON Serialization without Kryo

2023-03-20 Thread Shammon FY
Hi Rion, Is your job DataStream or Table/SQL? If it is a Table/SQL job, and you can define all the fields in the JSON that you need, then you can directly use the JSON format [1] to parse the data. You can also write custom UDFs to parse JSON data into structured data, such as map, row and other types suppor

Handling JSON Serialization without Kryo

2023-03-18 Thread Rion Williams
Hi all, I’m reaching out today for some suggestions (and hopefully a solution) for a Flink job that I’m working on. The job itself reads JSON strings from a Kafka topic and parses those into JSONObjects (currently via Gson), which are then operated on, before ultimately being written out t
