I know that for ADTs (sealed traits) there are ongoing efforts to overcome the performance degradation caused by the Kryo fallback (see https://github.com/apache/flink/pull/12929). For example:
```
sealed trait Event { def id: Int }
case class Pageview(id: Int, page: String) extends Event
case class Click(id: Int, url: String) extends Event
```

However, is there anything one can do in a similar situation with unsealed traits? That is, imagine that Event is not sealed because you don't know in advance which specific events you will be dealing with, or because you are working on a generic framework that will later be used to develop specific applications, each one dealing with its own set of events. So, basically, the trait cannot be sealed.

Instead of simple case classes like Pageview and Click, each specific application could be dealing with case classes auto-generated from protocol buffer definitions, in a system that should be able to add more types of events into the mix, so to speak.

How should one address this problem if serialization performance is to be optimized? The general advice is not to use Flink with heterogeneous types to start with, because it will not be able to derive efficient serializers. But the described use case sounds legitimate to me, so what would be the best way to handle it or, to put it another way, how can one minimize the performance degradation due to serialization?

FYI: Also posted on SO: https://stackoverflow.com/questions/65085335/best-way-to-handle-data-streams-for-non-sealed-trait-hierarchies