[ https://issues.apache.org/jira/browse/FLINK-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654012#comment-15654012 ]
ASF GitHub Bot commented on FLINK-5046:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2780

    [backport] [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information

    This is a backport of #2779 for the release 1.1 branch.

    In order to speed up the serialization of the TaskDeploymentDescriptor, we can pre-serialize all information that stays the same for all TaskDeploymentDescriptors. The information that is static for a TDD is the job-related information contained in the ExecutionGraph and the operator/task-related information stored in the ExecutionJobVertex.

    In order to pre-serialize this information, this PR introduces the JobInformation class and the TaskInformation class, which are stored in serialized form in the ExecutionGraph and the ExecutionJobVertex, respectively.

    Fix for release-1.1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink backportEagerStreamConfigSerialization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2780.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2780

----
commit 8e7252a9a0411f00d8efcc2ba0e6c9e9ffd88989
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-11-09T18:11:36Z

    [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information

    In order to speed up the serialization of the TaskDeploymentDescriptor, we can pre-serialize all information that stays the same for all TaskDeploymentDescriptors. The information that is static for a TDD is the job-related information contained in the ExecutionGraph and the operator/task-related information stored in the ExecutionJobVertex.

    In order to pre-serialize this information, this PR introduces the JobInformation class and the TaskInformation class, which are stored in serialized form in the ExecutionGraph and the ExecutionJobVertex, respectively.

    Fix for release-1.1

----

> Avoid redundant serialization when creating the TaskDeploymentDescriptor
> ------------------------------------------------------------------------
>
>                 Key: FLINK-5046
>                 URL: https://issues.apache.org/jira/browse/FLINK-5046
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When creating the {{TaskDeploymentDescriptor}}, we extract information from
> the {{ExecutionGraph}}, which is defined job-wide, and from the
> {{ExecutionJobVertex}}, which is defined operator-wide. The extracted
> information is serialized for every subtask even though it stays the
> same.
> As an improvement, we can serialize this information once and give the
> serialized byte array to the {{TaskDeploymentDescriptor}}. This will reduce
> the serialization work Flink has to do when deploying subtasks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
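
The idea described in the issue (serialize the shared information once and hand the same byte array to every descriptor) can be sketched roughly as follows. This is only an illustration and not the code from the pull request: `JobInfo`, `Descriptor`, and `createDescriptors` are hypothetical stand-ins rather than Flink classes, and plain Java serialization is assumed for simplicity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are part of Flink.
final class PreserializeSketch {

    /** Stand-in for information that is identical for every subtask of a job. */
    static final class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String jobName;

        JobInfo(String jobName) {
            this.jobName = jobName;
        }
    }

    /** Stand-in for a per-subtask deployment descriptor. */
    static final class Descriptor {
        final byte[] serializedJobInfo; // shared bytes, serialized once
        final int subtaskIndex;         // the only per-subtask field here

        Descriptor(byte[] serializedJobInfo, int subtaskIndex) {
            this.serializedJobInfo = serializedJobInfo;
            this.subtaskIndex = subtaskIndex;
        }
    }

    /** Serializes a value with standard Java serialization. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Builds one descriptor per subtask, all sharing a single serialized copy. */
    static List<Descriptor> createDescriptors(JobInfo jobInfo, int parallelism) {
        byte[] shared = serialize(jobInfo); // done once, not once per subtask
        List<Descriptor> descriptors = new ArrayList<>(parallelism);
        for (int i = 0; i < parallelism; i++) {
            descriptors.add(new Descriptor(shared, i));
        }
        return descriptors;
    }
}
```

With this shape, the cost of serializing the shared part no longer grows with the parallelism; each descriptor only carries a reference to the already-serialized bytes plus its own per-subtask fields.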