[ https://issues.apache.org/jira/browse/FLINK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922810#comment-15922810 ]
Billy Newport commented on FLINK-6022:
--------------------------------------

Thanks for this, Robert. To help with the ones we've implemented, we'd need a way of registering our schema objects on the ExecutionConfig and then looking them up on deserialization; a one-off call when the ExecutionConfig is inflated would also work. To be honest, all we would need is a way of registering a map of serializable state on the ExecutionConfig.

I think we are a little different from most in that we deal exclusively with GenericRecords with predeclared schemas, with no code-generated POJOs at all.

We've kicked off the internal process for contributing, so hopefully Regina Chan (also here) or I can help contribute to this.

> Improve support for Avro GenericRecord
> --------------------------------------
>
>                 Key: FLINK-6022
>                 URL: https://issues.apache.org/jira/browse/FLINK-6022
>             Project: Flink
>          Issue Type: Improvement
>          Components: Type Serialization System
>            Reporter: Robert Metzger
>
> Currently, Flink is serializing the schema for each Avro GenericRecord in the
> stream.
> This leads to a lot of overhead over the wire/disk + high serialization costs.
> Therefore, I'm proposing to improve the support for GenericRecord in Flink by
> shipping the schema to each serializer through the AvroTypeInformation.
> Then, we can only support GenericRecords with the same type per stream, but
> the performance will be much better.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
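
For illustration only, here is a minimal sketch of the "map of serializable state" idea from the comment above, written in plain Java with no Flink or Avro dependency. The class `ConfigWithState` and its `register`/`lookup` methods are hypothetical names and are not part of Flink's ExecutionConfig API; the schema is kept as a JSON string purely so the example stays self-contained. The `ship` helper round-trips the object through Java serialization to mimic the config being inflated on a worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class SchemaStateSketch {

    /** Hypothetical stand-in for a config object that carries user-registered serializable state. */
    static final class ConfigWithState implements Serializable {
        private static final long serialVersionUID = 1L;
        private final HashMap<String, Serializable> state = new HashMap<>();

        /** Registers a value (for example a schema definition) under a key. */
        void register(String key, Serializable value) {
            state.put(key, value);
        }

        /** Returns the value registered under the key, or null if there is none. */
        Serializable lookup(String key) {
            return state.get(key);
        }
    }

    /** Round-trips the config through Java serialization, mimicking shipping it to a worker. */
    static ConfigWithState ship(ConfigWithState config) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(config);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ConfigWithState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ConfigWithState config = new ConfigWithState();
        // The schema is registered once, up front, instead of travelling with every record.
        config.register("trade",
                "{\"type\":\"record\",\"name\":\"Trade\","
                        + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"}]}");

        // On the receiving side, the schema is looked up by key after the config is inflated.
        ConfigWithState onWorker = ship(config);
        System.out.println(onWorker.lookup("trade"));
    }
}
```

In this sketch the schema crosses the wire once with the config, and a deserializer would only need the key to find it again, which is the property both the comment and the issue description are after.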