Ok, I'll answer my own question.

This is caused by the fact that hadoop uses
System.getProperty("path.separator") as the delimiter when it stores
the list of jar files passed via -libjars (in the
mapred.job.classpath.archives property).

If your job spans platforms, System.getProperty("path.separator")
returns a different delimiter on each platform (";" on Windows, ":" on
Linux), so a list written on the client cannot be split correctly on
the cluster.
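
To make the mismatch concrete, here is a minimal sketch of what happens when the list is joined with one platform's separator and tokenized with the other's. The class name, helper methods, and jar paths are hypothetical and only for illustration; the separators are hard-coded rather than read from System.getProperty so the example behaves the same on any machine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of hadoop.
class SeparatorMismatchDemo {
    // Joins entries the way the submitting client would, with its own separator.
    static String join(List<String> entries, String separator) {
        return String.join(separator, entries);
    }

    // Splits the stored value the way a cluster node would, with its own separator.
    static List<String> split(String joined, String separator) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(joined, separator);
        while (tokens.hasMoreTokens()) {
            out.add(tokens.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList("/libs/a.jar", "/libs/b.jar");
        String written = join(jars, ";");         // Windows client: ";"
        List<String> read = split(written, ":");  // Linux node: ":"
        System.out.println(read.size());          // prints 1: both jars end up in a single entry
    }
}
```

The node sees one bogus entry, "/libs/a.jar;/libs/b.jar", instead of two jars.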

My solution is to use a comma as the delimiter, rather than the path.separator.

I realize a comma is, perhaps, a poor choice for a delimiter because it
is valid in filenames on both Windows and Linux, but the -libjars
option already uses it as the delimiter when listing the additional
required jars.  So, I figured if it's already being used as a
delimiter, then it's reasonable to use it internally as well.
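
As a rough sketch of the trade-off, the following shows a fixed comma delimiter round-tripping a jar list independently of the platform, along with the caveat that a filename containing a comma would be split in two. The class, method names, and paths are hypothetical; the append and parse logic only mirrors the shape of the patched code, it is not the hadoop source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of hadoop.
class CommaDelimiterDemo {
    // Fixed delimiter: the same on every platform.
    static final String DELIMITER = ",";

    // Appends an entry to the stored list, starting a new list if none exists yet.
    static String append(String existing, String entry) {
        return existing == null ? entry : existing + DELIMITER + entry;
    }

    // Splits the stored list back into its entries.
    static List<String> parse(String joined) {
        List<String> out = new ArrayList<>();
        if (joined == null) {
            return out;
        }
        StringTokenizer tokens = new StringTokenizer(joined, DELIMITER);
        while (tokens.hasMoreTokens()) {
            out.add(tokens.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String classpath = append(append(null, "/libs/a.jar"), "/libs/b.jar");
        System.out.println(parse(classpath).size());          // prints 2
        // Caveat: a comma inside a filename is treated as a delimiter.
        System.out.println(parse("/libs/my,lib.jar").size()); // prints 2
    }
}
```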

I've attached a patch (against 0.19.0) that applies this change.

Now, with this change, I can submit hadoop jobs (requiring multiple
supporting jars) from my Windows laptop (via cygwin) to my 10-node
Linux hadoop cluster.

Any chance this change could be applied to the hadoop codebase?

diff -ur src/core/org/apache/hadoop/filecache/DistributedCache.java src_working/core/org/apache/hadoop/filecache/DistributedCache.java
--- src/core/org/apache/hadoop/filecache/DistributedCache.java  2008-11-13 21:09:36.000000000 -0600
+++ src_working/core/org/apache/hadoop/filecache/DistributedCache.java  2008-12-12 14:07:48.865460800 -0600
@@ -710,7 +710,7 @@
     throws IOException {
     String classpath = conf.get("mapred.job.classpath.archives");
     conf.set("mapred.job.classpath.archives", classpath == null ? archive
-             .toString() : classpath + System.getProperty("path.separator")
+             .toString() : classpath + ","
              + archive.toString());
     FileSystem fs = FileSystem.get(conf);
     URI uri = fs.makeQualified(archive).toUri();
@@ -727,8 +727,7 @@
     String classpath = conf.get("mapred.job.classpath.archives");
     if (classpath == null)
       return null;
-    ArrayList list = Collections.list(new StringTokenizer(classpath, System
-                                                          .getProperty("path.separator")));
+    ArrayList list = Collections.list(new StringTokenizer(classpath, ","));
     Path[] paths = new Path[list.size()];
     for (int i = 0; i < list.size(); i++) {
       paths[i] = new Path((String) list.get(i));
