Enno Shioji created HADOOP-11444:
------------------------------------

             Summary: Jets3tFileSystemStore fails to remove initial slash from 
object keys, resulting in objects with double forward slashes being stored 
                 Key: HADOOP-11444
                 URL: https://issues.apache.org/jira/browse/HADOOP-11444
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 2.2.0
         Environment: java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
            Reporter: Enno Shioji
            Priority: Minor


While writing to S3 using Spark 1.2.0's ReceiverInputDStream#saveAsTextFiles 
with an S3 URL ("s3://fake-test/1234"), I noticed that files were written with 
double forward slashes (e.g. "s3://fake-test//1234/-1419334280000/").  

After debugging, it seems this is caused by 
Jets3tFileSystemStore#pathToKey(path), which returns "/1234/..." for 
the input "s3://fake-test/1234/...", when it should strip the leading forward 
slash.
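
As a rough illustration, the leading slash can be reproduced without Hadoop. The sketch below uses java.net.URI as a stand-in for Hadoop's Path#toUri() (an assumption made to keep the example self-contained). The class and method names are hypothetical, and objectUrl only approximates how a bucket and key are joined for display.

```java
import java.net.URI;

final class LeadingSlashDemo {
    // Returns the URI path component unchanged, mirroring what
    // Jets3tFileSystemStore#pathToKey does with path.toUri().getPath().
    static String rawKey(String url) {
        return URI.create(url).getPath();
    }

    // Hypothetical helper: joins bucket and key with a single "/" separator,
    // so a key that already starts with "/" yields a double slash.
    static String objectUrl(String url) {
        URI uri = URI.create(url);
        return uri.getScheme() + "://" + uri.getAuthority() + "/" + rawKey(url);
    }

    public static void main(String[] args) {
        String url = "s3://fake-test/1234/-1419334280000/";
        System.out.println(rawKey(url));    // key keeps its leading slash
        System.out.println(objectUrl(url)); // bucket + "/" + key
    }
}
```

The bucket name is the URI authority, so it is not part of the returned path; only the slash that separates it from the rest of the URL is carried into the key.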

When I used an s3n URL, and hence Jets3tNativeFileSystemStore, the double slashes 
went away. Here is a comparison of their pathToKey implementations:

Jets3tNativeFileSystemStore's implementation of pathToKey is:
======
  private static String pathToKey(Path path) {
    if (path.toUri().getScheme() != null && path.toUri().getPath().isEmpty()) {
      // allow uris without trailing slash after bucket to refer to root,
      // like s3n://mybucket
      return "";
    }
    if (!path.isAbsolute()) {
      throw new IllegalArgumentException("Path must be absolute: " + path);
    }
    String ret = path.toUri().getPath().substring(1); // remove initial slash
    if (ret.endsWith("/") && (ret.indexOf("/") != ret.length() - 1)) {
      ret = ret.substring(0, ret.length() - 1);
    }
    return ret;
  }
======

whereas Jets3tFileSystemStore uses:
======
  private String pathToKey(Path path) {
    if (!path.isAbsolute()) {
      throw new IllegalArgumentException("Path must be absolute: " + path);
    }
    return path.toUri().getPath();
  }
======
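
One possible direction for a fix, offered only as a sketch and not as the actual patch, is to drop the initial slash the same way the s3n store does. The example below again uses java.net.URI in place of Hadoop's Path so that it runs on its own; the class name PathToKeySketch is hypothetical, and the absolute-path check is approximated with startsWith("/").

```java
import java.net.URI;

final class PathToKeySketch {
    // Sketch of a pathToKey that removes the initial slash, modeled on the
    // s3n logic quoted above. Takes a URI instead of a Hadoop Path (assumption).
    static String pathToKey(URI uri) {
        String path = uri.getPath();
        if (uri.getScheme() != null && path.isEmpty()) {
            // URI without a trailing slash after the bucket refers to the root
            return "";
        }
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("Path must be absolute: " + uri);
        }
        return path.substring(1); // remove initial slash
    }

    public static void main(String[] args) {
        System.out.println(pathToKey(URI.create("s3://fake-test/1234/-1419334280000/")));
    }
}
```

This sketch leaves a trailing slash in place, unlike the s3n version. Changing the key format could also mean that objects already written under the old keys are no longer found, so a real fix would likely need to consider existing data.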



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
