[ https://issues.apache.org/jira/browse/FLINK-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652552#comment-15652552 ]
ASF GitHub Bot commented on FLINK-5046:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2779

    [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information

    In order to speed up the serialization of the TaskDeploymentDescriptor,
    we can pre-serialize all information which stays the same for all
    TaskDeploymentDescriptors. The information which is static for a TDD is
    the job-related information contained in the ExecutionGraph and the
    operator/task-related information stored in the ExecutionJobVertex.

    In order to pre-serialize this information, this PR introduces the
    JobInformation class and the TaskInformation class, which are stored in
    serialized form in the ExecutionGraph and the ExecutionJobVertex,
    respectively.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink eagerStreamConfigSerialization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2779

----
commit fb7621a5a5023595a89d7e92562b503ec2a039e5
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-11-09T18:11:36Z

    [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information

    In order to speed up the serialization of the TaskDeploymentDescriptor,
    we can pre-serialize all information which stays the same for all
    TaskDeploymentDescriptors. The information which is static for a TDD is
    the job-related information contained in the ExecutionGraph and the
    operator/task-related information stored in the ExecutionJobVertex.

    In order to pre-serialize this information, this PR introduces the
    JobInformation class and the TaskInformation class, which are stored in
    serialized form in the ExecutionGraph and the ExecutionJobVertex,
    respectively.

----

> Avoid redundant serialization when creating the TaskDeploymentDescriptor
> ------------------------------------------------------------------------
>
>                 Key: FLINK-5046
>                 URL: https://issues.apache.org/jira/browse/FLINK-5046
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When creating the {{TaskDeploymentDescriptor}} we extract information from
> the {{ExecutionGraph}} which is defined job-wide and from the
> {{ExecutionJobVertex}} which is defined operator-wide. The extracted
> information is serialized for every subtask even though it stays the
> same.
> As an improvement, we can serialize this information once and give the
> serialized byte array to the {{TaskDeploymentDescriptor}}. This will reduce
> the serialization work Flink has to do when deploying subtasks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
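
The idea described in the issue (serialize the shared information once and
hand the resulting byte array to every descriptor) can be sketched in plain
Java as follows. This is an illustrative sketch only and not the code from
the pull request: the classes JobInfo, TaskInfo and Descriptor and the method
createDescriptors are hypothetical stand-ins for JobInformation,
TaskInformation and the TaskDeploymentDescriptor, and standard Java
serialization is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PreserializedDeploymentSketch {

    /** Hypothetical job-wide information; identical for every subtask of the job. */
    public static final class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String jobName;
        public JobInfo(String jobName) { this.jobName = jobName; }
    }

    /** Hypothetical operator-wide information; identical for every subtask of the operator. */
    public static final class TaskInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String taskName;
        public final int parallelism;
        public TaskInfo(String taskName, int parallelism) {
            this.taskName = taskName;
            this.parallelism = parallelism;
        }
    }

    /** Hypothetical descriptor: shared pre-serialized bytes plus the per-subtask index. */
    public static final class Descriptor implements Serializable {
        private static final long serialVersionUID = 1L;
        public final byte[] serializedJobInfo;
        public final byte[] serializedTaskInfo;
        public final int subtaskIndex;
        public Descriptor(byte[] serializedJobInfo, byte[] serializedTaskInfo, int subtaskIndex) {
            this.serializedJobInfo = serializedJobInfo;
            this.serializedTaskInfo = serializedTaskInfo;
            this.subtaskIndex = subtaskIndex;
        }
    }

    /** Serializes a value with standard Java serialization. */
    public static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Restores a value from bytes produced by serialize(). */
    @SuppressWarnings("unchecked")
    public static <T> T deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the shared information once, then reuses the bytes for each subtask. */
    public static List<Descriptor> createDescriptors(JobInfo job, TaskInfo task) {
        byte[] jobBytes = serialize(job);   // once per job
        byte[] taskBytes = serialize(task); // once per operator
        List<Descriptor> descriptors = new ArrayList<>();
        for (int i = 0; i < task.parallelism; i++) {
            descriptors.add(new Descriptor(jobBytes, taskBytes, i));
        }
        return descriptors;
    }

    public static void main(String[] args) {
        List<Descriptor> descriptors =
                createDescriptors(new JobInfo("example-job"), new TaskInfo("map", 4));
        JobInfo restored = deserialize(descriptors.get(0).serializedJobInfo);
        System.out.println(descriptors.size() + " descriptors for job " + restored.jobName);
    }
}
```

In this sketch every descriptor references the same two byte arrays, so the
job and operator objects are walked by the serializer once instead of once
per subtask; shipping a descriptor then only copies the already serialized
bytes.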