[ 
https://issues.apache.org/jira/browse/HDFS-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-355.
-----------------------------------

    Resolution: Fixed

Federation sort of fixes this. Closing.

> Ability to throttle DFS/MR so as not to overwhelm colo to colo switches
> -----------------------------------------------------------------------
>
>                 Key: HDFS-355
>                 URL: https://issues.apache.org/jira/browse/HDFS-355
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Pete Wyckoff
>
> Motivation:
> This would allow people to put less frequently used data in a non-co-located
> HDFS instance and pull it from the other cluster when needed.
> This is useful in the context of Hive, where a Metastore tells the runtime
> system where the data is located (the full URI), or of symbolic links.
> The problem:
> This will not work right now because it may overwhelm switches between the 
> two instances.  
> Workaround:
> Make the files unsplittable or choose a block size such that you only get 2-3
> mappers.
> Possible solution:
> Throttle parallelism in the scheduler by letting a job specify that only X
> mappers run at a time, no matter how many slots are free (making some
> assumptions about the reliability of the JobTracker's failure detector).
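
For illustration only, the "run only X mappers no matter how many slots are free" idea can be sketched outside Hadoop. This is a minimal sketch, not the scheduler change the issue asks for and not a Hadoop API: a thread pool stands in for the free task slots, and the names run_throttled, free_slots and max_mappers are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(tasks, free_slots, max_mappers):
    """Run tasks on free_slots workers, but never more than max_mappers at once.

    free_slots and max_mappers are hypothetical stand-ins for the cluster's
    free map slots and the per-job cap "X" from the issue description.
    Returns the task results (in input order) and the peak concurrency seen.
    """
    gate = threading.Semaphore(max_mappers)  # the per-job throttle
    lock = threading.Lock()                  # protects the counters below
    running = 0
    peak = 0

    def guarded(task):
        nonlocal running, peak
        with gate:  # blocks while max_mappers tasks are already running
            with lock:
                running += 1
                peak = max(peak, running)
            try:
                return task()
            finally:
                with lock:
                    running -= 1

    # The pool may have many idle workers ("free slots"); the semaphore,
    # not the pool size, is what limits how many tasks run concurrently.
    with ThreadPoolExecutor(max_workers=free_slots) as pool:
        results = list(pool.map(guarded, tasks))
    return results, peak
```

With ten free slots and a cap of three, at most three of the stand-in mappers are ever in flight, which is the property that would keep cross-colo traffic bounded.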



--
This message was sent by Atlassian JIRA
(v6.2#6252)
