[ https://issues.apache.org/jira/browse/FLINK-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830330#comment-16830330 ]

Zhenqiu Huang commented on FLINK-12343:
---------------------------------------

[~till.rohrmann] [~rmetzger] I think we can set hdfs.replication in the YarnConfiguration of AbstractYarnClusterDescriptor. Since this configuration is only used on the client side, it will not affect file replication at runtime.

I initially chose the setReplication method because our org will use S3 in the long term to submit jobs to different cluster management systems, and I wanted the replication setting to apply to both HDFS and S3. However, it looks like S3AFileSystem doesn't implement that method, so I think it is fine to start with hdfs.replication. What do you think?

> Allow set file.replication in Yarn Configuration
> ------------------------------------------------
>
>                 Key: FLINK-12343
>                 URL: https://issues.apache.org/jira/browse/FLINK-12343
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client, Deployment / YARN
>    Affects Versions: 1.6.4, 1.7.2, 1.8.0
>            Reporter: Zhenqiu Huang
>            Assignee: Zhenqiu Huang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, FlinkYarnSessionCli uploads jars into HDFS with the default
> replication factor of 3. From our production experience, a replication
> factor of 3 blocks big jobs (256 containers) from launching when HDFS is
> slow due to heavy workload from batch pipelines. Thus, we want to make the
> factor customizable from FlinkYarnSessionCli by adding an option.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)