[ 
https://issues.apache.org/jira/browse/FLINK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922810#comment-15922810
 ] 

Billy Newport commented on FLINK-6022:
--------------------------------------

Thanks for this, Robert. To help with the ones we've implemented, we'd 
need a way of registering our schema objects on the ExecutionConfig and then 
looking them up on deserialization. A one-off call when the ExecutionConfig 
is inflated would also work. To be honest, a way of registering a map of 
serializable state on the ExecutionConfig is all we would need, at least 
for now.
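
As a rough illustration of the "map of serializable state" idea, here is a 
minimal Python sketch. JobConfig is a hypothetical stand-in and not Flink's 
ExecutionConfig; the register/lookup names, the "schemas" key, and the User 
schema are assumptions made up for the example. It only shows the shape of 
the flow: register schemas once, ship the map, look them up after the config 
is inflated.

```python
import json
import pickle


class JobConfig:
    """Hypothetical stand-in for a job-wide config (NOT Flink's ExecutionConfig)."""

    def __init__(self):
        # Map of serializable state: key -> any picklable value.
        self.user_state = {}

    def register(self, key, value):
        self.user_state[key] = value

    def lookup(self, key):
        return self.user_state[key]


# A predeclared schema, written as Avro-style JSON (illustrative only).
user_schema_json = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
})

# Client side: register the schemas once on the config.
client_config = JobConfig()
client_config.register("schemas", {"User": user_schema_json})

# Simulate shipping the map to a worker and inflating the config there.
payload = pickle.dumps(client_config.user_state)
worker_config = JobConfig()
worker_config.user_state = pickle.loads(payload)

# Worker side: look the schema up by name when deserializing.
schema = json.loads(worker_config.lookup("schemas")["User"])
field_names = [field["name"] for field in schema["fields"]]
```

The schemas are stored as JSON strings here so the map stays trivially 
serializable; whether real schema objects could be registered directly is a 
separate question.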

We are a little different from most, I think, in that we deal exclusively with 
GenericRecords with predeclared schemas, with no code-generated POJOs at all. 
We've kicked off the internal process for contributing, so hopefully Regina 
Chan (also here) or I can help contribute to this.

> Improve support for Avro GenericRecord
> --------------------------------------
>
>                 Key: FLINK-6022
>                 URL: https://issues.apache.org/jira/browse/FLINK-6022
>             Project: Flink
>          Issue Type: Improvement
>          Components: Type Serialization System
>            Reporter: Robert Metzger
>
> Currently, Flink is serializing the schema for each Avro GenericRecord in the 
> stream.
> This leads to a lot of overhead over the wire/disk and high serialization 
> costs.
> Therefore, I'm proposing to improve the support for GenericRecord in Flink by 
> shipping the schema to each serializer through the AvroTypeInformation.
> With that change we can only support GenericRecords of the same type per 
> stream, but the performance will be much better.
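
To make the overhead described above concrete, here is a toy Python sketch. 
It uses JSON rather than Avro's binary encoding, and the schema, records, and 
function names are all made up for illustration, so the sizes only show the 
trend, not real Flink or Avro numbers. It compares embedding the schema in 
every record with shipping the schema once and sending only field values.

```python
import json

# Illustrative Avro-style schema; not taken from the issue.
schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}

records = [{"id": i, "name": f"user-{i}"} for i in range(1000)]


def encode_with_schema(record):
    """Per-record schema: every message carries its own schema copy."""
    return json.dumps({"schema": schema, "data": record}).encode("utf-8")


def encode_values_only(record):
    """Shared schema: send only the values, in the schema's field order."""
    values = [record[field["name"]] for field in schema["fields"]]
    return json.dumps(values).encode("utf-8")


def decode_values_only(payload):
    """Rebuild the record using the schema the reader already holds."""
    values = json.loads(payload.decode("utf-8"))
    return {field["name"]: value for field, value in zip(schema["fields"], values)}


# Total bytes when the schema travels with every record.
per_record_total = sum(len(encode_with_schema(r)) for r in records)

# Total bytes when the schema is shipped once, followed by values only.
schema_once = len(json.dumps(schema).encode("utf-8"))
shared_total = schema_once + sum(len(encode_values_only(r)) for r in records)

round_trip = decode_values_only(encode_values_only(records[0]))
```

The values-only decoder works only because every record in the stream follows 
the same schema, which mirrors the one-type-per-stream restriction mentioned 
in the description.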



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
