[ 
https://issues.apache.org/jira/browse/FLINK-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830362#comment-16830362
 ] 

Zhenqiu Huang commented on FLINK-12343:
---------------------------------------

[~till.rohrmann]

I agree that we shouldn't define a default value, so that the replication can 
easily fall back to the HDFS default on the client side. Since the method 
createTaskExecutorContext mainly registers the remote files as local 
resources, perhaps we can just use the default setting of the YARN cluster 
there? What do you think?
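
To make the proposed fallback concrete, here is a minimal sketch. The class and method names are hypothetical and are not existing Flink code. It assumes that an unset option is represented by a non-positive value. On the client side, the file system default would presumably come from Hadoop's FileSystem#getDefaultReplication(Path), and the result would be applied with FileSystem#setReplication(Path, short).

```java
// Hypothetical helper for illustration only; not part of Flink.
final class ReplicationFallback {

    private ReplicationFallback() {}

    /**
     * Picks the replication factor for uploaded files.
     *
     * @param configured value of the user option; assumed to be <= 0 when unset
     * @param fsDefault  default replication reported by the file system
     * @return the configured factor if set, otherwise the file system default
     */
    static short resolveReplication(int configured, short fsDefault) {
        if (configured > 0) {
            if (configured > Short.MAX_VALUE) {
                throw new IllegalArgumentException(
                        "Replication factor too large: " + configured);
            }
            return (short) configured;
        }
        // No explicit setting: fall back to the file system default.
        return fsDefault;
    }
}
```

With this shape, no default has to be defined on the Flink side: leaving the option unset simply yields whatever the cluster's file system reports.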

> Allow set file.replication in Yarn Configuration
> ------------------------------------------------
>
>                 Key: FLINK-12343
>                 URL: https://issues.apache.org/jira/browse/FLINK-12343
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client, Deployment / YARN
>    Affects Versions: 1.6.4, 1.7.2, 1.8.0
>            Reporter: Zhenqiu Huang
>            Assignee: Zhenqiu Huang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, FlinkYarnSessionCli uploads jars into HDFS with the default of 3 
> replicas. From our production experience, we find that 3 replicas can block 
> the launch of a big job (256 containers) when HDFS is slow due to heavy 
> workload from batch pipelines. Thus, we want to make the replication factor 
> customizable from FlinkYarnSessionCli by adding an option.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
