Re: stream of large objects

Chesnay Schepler Fri, 08 Feb 2019 05:46:49 -0800

Whether a LargeMessage is serialized depends on how the job is structured.

For example, if you only apply map/filter functions after the aggregation, it is likely they won't be serialized.

If you apply another keyBy, they will be serialized again, since keyBy redistributes records across subtasks.
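
To get a rough feel for what that serialization would cost per record, something like the sketch below can help. It is plain Java and does not use Flink's serializers; the class name, the helper `estimateSerializedBytes`, and the length-prefixed layout are all made up for illustration, so treat the result as an order-of-magnitude estimate only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LargeMessageSizeEstimate {

    // Mirrors the LargeMessage structure from the question.
    static class LargeMessage {
        String key;
        List<String> messages = new ArrayList<>();
    }

    // Hypothetical helper: writes the key, the element count, and each message
    // in a simple length-prefixed layout. This is NOT Flink's wire format.
    // Note: writeUTF rejects strings whose encoded form exceeds 65535 bytes.
    static int estimateSerializedBytes(LargeMessage m) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(m.key);              // 2-byte length + encoded bytes
            out.writeInt(m.messages.size());  // 4-byte element count
            for (String s : m.messages) {
                out.writeUTF(s);              // 2-byte length + encoded bytes
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        LargeMessage m = new LargeMessage();
        m.key = "example-key";
        // Assumed workload: a few thousand short messages per key.
        for (int i = 0; i < 5000; i++) {
            m.messages.add("payload-" + i);
        }
        System.out.println(estimateSerializedBytes(m) + " bytes (rough estimate)");
    }
}
```

Running it with message sizes and counts that resemble the real data gives a number to reason about before deciding how many keyBy boundaries the job should have.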


When you say "small size" messages, what are we talking about here?

On 07.02.2019 20:37, Aggarwal, Ajay wrote:

In my use case, my source stream contains small messages, but as part of Flink processing I will be aggregating them into large messages, and further processing will happen on these large messages. The structure of this large message will be something like this:
   class LargeMessage {

      String key;
      List<String> messages; // this is where the aggregation of smaller messages happens
   }
In some cases this list field of LargeMessage can get very large (thousands of messages). Is it OK to create an intermediate stream of these LargeMessages? What should I be concerned about while designing the Flink job, specifically with parallelism in mind? As these LargeMessages flow from one Flink subtask to another, do they get serialized/deserialized?
Thanks.
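
For reference, a compilable version of the LargeMessage class described above might look like the sketch below. It follows the usual Flink POJO conventions (public class, public no-argument constructor, public fields); the `add` and `merge` helpers are hypothetical and only illustrate how smaller messages could be folded into one object during aggregation.

```java
import java.util.ArrayList;
import java.util.List;

public class LargeMessage {

    public String key;
    public List<String> messages = new ArrayList<>();

    // A public no-argument constructor is needed for POJO-style handling.
    public LargeMessage() {
    }

    public LargeMessage(String key) {
        this.key = key;
    }

    // Hypothetical helper: appends one small message to the aggregate.
    public LargeMessage add(String message) {
        messages.add(message);
        return this;
    }

    // Hypothetical helper: folds another partial aggregate into this one.
    public LargeMessage merge(LargeMessage other) {
        messages.addAll(other.messages);
        return this;
    }
}
```

The list grows without bound here, so a real job would probably want a cap or a window on how many messages end up in one LargeMessage.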
