[ 
https://issues.apache.org/jira/browse/FLINK-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654012#comment-15654012
 ] 

ASF GitHub Bot commented on FLINK-5046:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2780

    [backport] [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information

    This is a backport of #2779 for the release 1.1 branch.
    
    To speed up serialization of the TaskDeploymentDescriptor (TDD), we can pre-serialize
    all information that stays the same across TaskDeploymentDescriptors. The information
    that is static for a TDD is the job-related information contained in the ExecutionGraph
    and the operator/task-related information stored in the ExecutionJobVertex.
    
    To pre-serialize this information, this PR introduces the JobInformation class
    and the TaskInformation class, which are stored in serialized form in the ExecutionGraph
    and the ExecutionJobVertex, respectively.
    
    Fix for release-1.1
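
The idea described above can be illustrated with a minimal sketch. This is not the code from the pull request: `PreserializeSketch`, `JobInfo`, `Descriptor`, and `createDescriptors` are hypothetical stand-ins, and plain Java serialization is assumed in place of whatever mechanism Flink actually uses. It only shows the shared data being serialized once and the same byte array being handed to every per-subtask descriptor.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PreserializeSketch {

    // Hypothetical stand-in for information that is identical for all subtasks.
    static final class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String jobName;

        JobInfo(String jobName) {
            this.jobName = jobName;
        }
    }

    // Hypothetical stand-in for a per-subtask deployment descriptor.
    // It carries the shared data as already-serialized bytes.
    static final class Descriptor {
        final byte[] serializedJobInfo;
        final int subtaskIndex;

        Descriptor(byte[] serializedJobInfo, int subtaskIndex) {
            this.serializedJobInfo = serializedJobInfo;
            this.subtaskIndex = subtaskIndex;
        }
    }

    // Serializes a value with standard Java serialization.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Serializes the shared information once, outside the loop, and reuses
    // the resulting byte array for every subtask's descriptor.
    static List<Descriptor> createDescriptors(JobInfo info, int parallelism) {
        byte[] shared = serialize(info);
        List<Descriptor> descriptors = new ArrayList<>(parallelism);
        for (int i = 0; i < parallelism; i++) {
            descriptors.add(new Descriptor(shared, i));
        }
        return descriptors;
    }
}
```

With this shape, the cost of serializing the shared object is paid once per job (or once per operator) rather than once per subtask; the receiving side would deserialize the byte array when it needs the contents.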

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink backportEagerStreamConfigSerialization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2780.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2780
    
----
commit 8e7252a9a0411f00d8efcc2ba0e6c9e9ffd88989
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-11-09T18:11:36Z

    [FLINK-5046] [tdd] Preserialize TaskDeploymentDescriptor information
    
    To speed up serialization of the TaskDeploymentDescriptor (TDD), we can pre-serialize
    all information that stays the same across TaskDeploymentDescriptors. The information
    that is static for a TDD is the job-related information contained in the ExecutionGraph
    and the operator/task-related information stored in the ExecutionJobVertex.
    
    To pre-serialize this information, this PR introduces the JobInformation class
    and the TaskInformation class, which are stored in serialized form in the ExecutionGraph
    and the ExecutionJobVertex, respectively.
    
    Fix for release-1.1

----


> Avoid redundant serialization when creating the TaskDeploymentDescriptor
> ------------------------------------------------------------------------
>
>                 Key: FLINK-5046
>                 URL: https://issues.apache.org/jira/browse/FLINK-5046
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When creating the {{TaskDeploymentDescriptor}}, we extract information from 
> the {{ExecutionGraph}}, which is defined job-wide, and from the 
> {{ExecutionJobVertex}}, which is defined operator-wide. The extracted 
> information is serialized for every subtask even though it stays the 
> same. 
> As an improvement, we can serialize this information once and pass the 
> serialized byte array to the {{TaskDeploymentDescriptor}}. This reduces 
> the serialization work Flink has to do when deploying subtasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
