[ https://issues.apache.org/jira/browse/HADOOP-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HADOOP-6140.
--------------------------------------
    Resolution: Duplicate

I'm just going to dupe it to MAPREDUCE-752 at this point.

> DistributedCache.addArchiveToClassPath doesn't work in 0.18.x branch
> --------------------------------------------------------------------
>
>                 Key: HADOOP-6140
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6140
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.18.3
>            Reporter: Vladimir Klimontovich
>         Attachments: HADOOP-6140-ver4.patch
>
>
> addArchiveToClassPath is a method of the DistributedCache class. It should be
> called before running a job. It accepts the path to a jar file on DFS. The
> method should put the jar file on the distributed cache and then add that
> file to the classpath of each map/reduce task.
> This method doesn't work.
> Bug 1:
> addArchiveToClassPath appends the DFS path of the archive to the
> mapred.job.classpath.archives property, using
> System.getProperty("path.separator") as the delimiter between multiple paths.
> getFileClassPaths, which is called from TaskRunner, splits
> mapred.job.classpath.archives on System.getProperty("path.separator").
> On Unix systems System.getProperty("path.separator") is ":", and DFS URLs
> have the form hdfs://host:port/path, so the split produces
> [hdfs, //host, port/path].
> Suggested solution: use "," as the delimiter instead.
> Bug 2:
> In TaskRunner, the algorithm that matches DFS paths against local paths in
> the distributed cache compares
> if (archives[i].getPath().equals(
>                            archiveClasspaths[j].toString())) {
> instead of
> if (archives[i].toString().equals(
>                            archiveClasspaths[j].toString())) {

-- This message was sent by Atlassian JIRA (v6.2#6252)
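Both bugs can be reproduced outside Hadoop. The sketch below uses plain `String.split` and `java.net.URI` as a stand-in for Hadoop's `Path` (the hostname and jar name are made up for illustration): splitting a DFS URL on the Unix path separator ":" shreds the URL, and comparing a URI's path component against another URI's full string form can never match.

```java
import java.net.URI;
import java.util.Arrays;

public class ClasspathBugsDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical DFS URL, as would be stored in mapred.job.classpath.archives.
        String archive = "hdfs://namenode:9000/lib/deps.jar";

        // Bug 1: on Unix, System.getProperty("path.separator") is ":".
        // Splitting the URL on ":" breaks it at the scheme and the port.
        String[] parts = archive.split(":");
        System.out.println(Arrays.toString(parts));
        // [hdfs, //namenode, 9000/lib/deps.jar] -- the URL is destroyed,
        // which is why "," is the safer delimiter for a list of URLs.

        // Bug 2: getPath() drops the scheme and authority, so comparing it
        // against another entry's toString() always fails for DFS URLs.
        URI dfs = new URI(archive);
        System.out.println(dfs.getPath());                    // /lib/deps.jar
        System.out.println(dfs.getPath().equals(archive));    // false: wrong comparison
        System.out.println(dfs.toString().equals(archive));   // true: the fix
    }
}
```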