[ 
https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849138#comment-15849138
 ] 

Bill Liu edited comment on FLINK-5668 at 2/1/17 11:56 PM:
----------------------------------------------------------

Thanks [~wheat9] for filling in the full context.
YARN's own fault tolerance and high availability rely on HDFS, but that 
doesn't mean Flink-on-YARN has to depend on HDFS.
Not to mention that some of the HDFS dependencies are not necessary at all.
For the TaskManager configuration file, I took a close look at the code: 
the TaskManager config is cloned from baseConfig and then only a few slight 
changes are made to it.
```
final Configuration taskManagerConfig = BootstrapTools.generateTaskManagerConfiguration(
        config, akkaHostname, akkaPort, slotsPerTaskManager, TASKMANAGER_REGISTRATION_TIMEOUT);

public static Configuration generateTaskManagerConfiguration(
        Configuration baseConfig,
        String jobManagerHostname,
        int jobManagerPort,
        int numSlots,
        FiniteDuration registrationTimeout) {

    Configuration cfg = baseConfig.clone();

    cfg.setString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, jobManagerHostname);
    cfg.setInteger(ConfigConstants.JOB_MANAGER_IPC_PORT_KEY, jobManagerPort);
    cfg.setString(ConfigConstants.TASK_MANAGER_MAX_REGISTRATION_DURATION, registrationTimeout.toString());
    if (numSlots != -1) {
        cfg.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots);
    }

    return cfg;
}
``` 
[~StephanEwen],
If the JobManager web server is not a good place to share files, the JobManager 
doesn't need to create a local taskmanager-conf.yaml at all. It could just pass 
the base config file plus some dynamic properties that override the 
corresponding values in the base config.
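
To make the suggestion above concrete, here is a minimal sketch of the "base config plus dynamic properties" idea. It is not Flink code: the class DynamicPropertiesSketch and both of its methods are hypothetical, plain maps stand in for Flink's Configuration, and the "-Dkey=value" encoding and the key strings are assumptions chosen for illustration. The JobManager side would only need to compute the handful of overridden values and pass them along, and the TaskManager side would apply them on top of the unchanged base config.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
class DynamicPropertiesSketch {

    /** JobManager side: encode each overridden value as a "-Dkey=value" argument. */
    static List<String> toDynamicProperties(Map<String, String> overrides) {
        List<String> dynArgs = new ArrayList<>();
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            dynArgs.add("-D" + e.getKey() + "=" + e.getValue());
        }
        return dynArgs;
    }

    /** TaskManager side: copy the base config and apply the "-Dkey=value" arguments on top. */
    static Map<String, String> applyDynamicProperties(
            Map<String, String> baseConfig, List<String> dynArgs) {
        Map<String, String> cfg = new LinkedHashMap<>(baseConfig);
        for (String arg : dynArgs) {
            int eq = arg.indexOf('=');
            // Skip anything that is not of the form -D<key>=<value> with a non-empty key.
            if (!arg.startsWith("-D") || eq < 3) {
                continue;
            }
            cfg.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return cfg;
    }

    public static void main(String[] argv) {
        // Stand-in for the shared base config (key names are assumed for the example).
        Map<String, String> base = new LinkedHashMap<>();
        base.put("taskmanager.numberOfTaskSlots", "1");

        // The few values that generateTaskManagerConfiguration changes.
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("jobmanager.rpc.address", "jm-host");
        overrides.put("jobmanager.rpc.port", "6123");
        overrides.put("taskmanager.numberOfTaskSlots", "4");

        List<String> dynArgs = toDynamicProperties(overrides);
        System.out.println(dynArgs);
        System.out.println(applyDynamicProperties(base, dynArgs));
    }
}
```

With this approach nothing TaskManager-specific has to be written to a file or uploaded; only the base config, which is already available, and a short argument list are needed.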







> Reduce dependency on HDFS at job startup time
> ---------------------------------------------
>
>                 Key: FLINK-5668
>                 URL: https://issues.apache.org/jira/browse/FLINK-5668
>             Project: Flink
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Bill Liu
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When creating a Flink cluster on YARN, the JobManager depends on HDFS to share 
> taskmanager-conf.yaml with the TaskManagers.
> It would be better to serve taskmanager-conf.yaml from the JobManager web server 
> instead of HDFS, which would reduce the HDFS dependency at job startup.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
