[ https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886220#comment-15886220 ]
Bill Liu edited comment on FLINK-5668 at 2/27/17 6:00 PM:
----------------------------------------------------------

[~rmetzger] [~wheat9] and I are working on implementing a Flink job deployer for YARN with {{HttpFs}} and {{S3}}.
The YARN container can resolve the {{http/s3}} file schemes.
We use HttpFs instead of HDFS to bootstrap the JobManager.
Here is the code to set up the AM container (JobManager):
{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;

// yarnConfiguration, localResources, amContainer and LOG come from the surrounding deployer code.
Path resourcePath = new Path("http://localhost:19989/flink-dist.jar");
// The filesystem is resolved from the scheme of the path itself (http here), not from a global default.
FileStatus fileStatus = resourcePath.getFileSystem(yarnConfiguration)
    .getFileStatus(resourcePath);
LOG.info("resource {}", ConverterUtils.getYarnUrlFromPath(resourcePath));
LocalResource packageResource = LocalResource.newInstance(
    ConverterUtils.getYarnUrlFromPath(resourcePath),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    fileStatus.getLen(),
    fileStatus.getModificationTime());
LOG.info("add localresource {}", packageResource);
localResources.put("flink.jar", packageResource);
amContainer.setLocalResources(localResources);
{code}
{{yarn.deploy.fs}} is not a good idea, because these bootstrap jars/files may be located on different filesystems.
It's better to parse each jar's Path to determine the underlying filesystem of that jar.

was (Author: bill.liu8904):
[~rmetzger] [~wheat9] and I are working on implementing a Flink job deployer for YARN with `HttpFs` and `S3`.
The YARN container can resolve the `http/s3` file schemes.
We use `HttpFs` instead of `HDFS` to bootstrap the JobManager.
Here is the code to set up the AM container (JobManager):
```
Path resourcePath = new Path("http://localhost:19989/flink-dist.jar");
FileStatus fileStatus = resourcePath.getFileSystem(yarnConfiguration)
    .getFileStatus(resourcePath);
LOG.info("resource {}", ConverterUtils.getYarnUrlFromPath(resourcePath));
LocalResource packageResource = LocalResource.newInstance(
    ConverterUtils.getYarnUrlFromPath(resourcePath),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    fileStatus.getLen(),
    fileStatus.getModificationTime());
LOG.info("add localresource {}", packageResource);
localResources.put("flink.jar", packageResource);
amContainer.setLocalResources(localResources);
```
`yarn.deploy.fs` is not a good idea, because these bootstrap jars/files may be located on different filesystems.
It's better to parse each jar's Path to determine the underlying filesystem of that jar.

> Reduce dependency on HDFS at job startup time
> ---------------------------------------------
>
>                 Key: FLINK-5668
>                 URL: https://issues.apache.org/jira/browse/FLINK-5668
>             Project: Flink
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Bill Liu
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When creating a Flink cluster on YARN, the JobManager depends on HDFS to share
> taskmanager-conf.yaml with the TaskManager.
> It would be better to serve taskmanager-conf.yaml from the JobManager web server
> instead of HDFS, which would reduce the HDFS dependency at job startup.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
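
To illustrate the point in the comment about parsing each bootstrap resource's path instead of relying on a single `yarn.deploy.fs` setting, here is a minimal sketch using only `java.net.URI`. The class `BootstrapResourceSchemes`, its methods, the S3 bucket name and the local log4j path are hypothetical and not part of Flink or Hadoop. In a real deployer the lookup would presumably go through `Path.getFileSystem(Configuration)`, as in the snippet in the comment; treating a scheme-less path as `file` is an assumption of this sketch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink or Hadoop: groups bootstrap
// resources by the filesystem scheme encoded in each resource location.
class BootstrapResourceSchemes {

    // Returns the URI scheme ("http", "s3", "hdfs", ...) of a location.
    // Assumption of this sketch: a location without a scheme is a local file.
    static String schemeOf(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme == null ? "file" : scheme.toLowerCase(Locale.ROOT);
    }

    // Groups locations by scheme, so each group can be handled by its own filesystem.
    static Map<String, List<String>> groupByScheme(List<String> locations) {
        Map<String, List<String>> byScheme = new TreeMap<>();
        for (String location : locations) {
            byScheme.computeIfAbsent(schemeOf(location), k -> new ArrayList<>())
                    .add(location);
        }
        return byScheme;
    }

    public static void main(String[] args) {
        // Only the first URL appears in the comment; the other two are made up.
        List<String> resources = List.of(
            "http://localhost:19989/flink-dist.jar",
            "s3://example-bucket/flink/taskmanager-conf.yaml",
            "/tmp/flink/log4j.properties");
        groupByScheme(resources).forEach((scheme, paths) ->
            System.out.println(scheme + " -> " + paths));
    }
}
```

Because the scheme is read per resource, jars and config files that live on different filesystems can sit side by side in the same deployment without a global filesystem option.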