1. s3n is actually the URI scheme for the Amazon S3 filesystem [1]. Normally HDFS URIs start with "hdfs://" and local URIs start with "file://". There is a default configured in your local Hadoop setup (fs.default.name). This [2] seems like a useful link.
2. See 1. :)

3. It looks like in the default case (the last else) it just uses whatever
your default filesystem is. Chances are that it's file:// in your case.
Setting the URI to hdfs://localhost:9000 (generically host:port, and the
port might be different on your machine) should fix it.

Good luck!

[1] http://wiki.apache.org/hadoop/AmazonS3
[2] http://www.greenplum.com/blog/dive-in/usage-and-quirks-of-fs-default-name-in-hadoop-filesystem

On Tue, Mar 12, 2013 at 2:37 PM, Chris Harrington <[email protected]> wrote:
> Hi all,
>
> The subject line says it all, ClusterDumper is writing to the local file
> system instead of HDFS.
>
> After looking at the source:
>
> From the ClusterDumper class
>
>     if (this.outputFile == null) {
>       shouldClose = false;
>       writer = new OutputStreamWriter(System.out);
>     } else {
>       shouldClose = true;
>       if (outputFile.getName().startsWith("s3n://")) {
>         Path p = outputPath;
>         FileSystem fs = FileSystem.get(p.toUri(), conf);
>         writer = new OutputStreamWriter(fs.create(p), Charsets.UTF_8);
>       } else {
>         writer = Files.newWriter(this.outputFile, Charsets.UTF_8);
>       }
>     }
>
> From the Files class
>
>     public static BufferedWriter newWriter(File file, Charset charset)
>         throws FileNotFoundException {
>       return new BufferedWriter(
>           new OutputStreamWriter(new FileOutputStream(file), charset));
>     }
>
> So a few questions on the above.
>
> 1. Am I correct in saying that if the outputFile starts with "s3n://" it
> writes to HDFS, otherwise it writes to the local FS?
>
> 2. If the above is true, then what is the meaning of a URI starting with
> s3n://?
>
> 3. Is there a way to force it to write to HDFS even if the URI doesn't
> start with s3n://, or am I going to have to modify the ClusterDumper class
> myself?
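For question 3, the workaround outside of ClusterDumper is to resolve the
path against an explicit hdfs:// URI yourself, the same way the s3n branch
does. A minimal sketch, assuming a NameNode at localhost:9000 and a made-up
output path (both are placeholders, not anything from ClusterDumper):

```java
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriterSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // hdfs://localhost:9000 is an assumption -- substitute your own
    // NameNode host:port, and /user/me/clusterdump.txt is a hypothetical
    // output location.
    Path p = new Path("hdfs://localhost:9000/user/me/clusterdump.txt");
    // FileSystem.get() picks the implementation from the URI scheme,
    // so this resolves to HDFS regardless of fs.default.name.
    FileSystem fs = FileSystem.get(p.toUri(), conf);
    try (BufferedWriter writer = new BufferedWriter(
        new OutputStreamWriter(fs.create(p), StandardCharsets.UTF_8))) {
      writer.write("cluster output goes here");
    }
  }
}
```

Because FileSystem.get() dispatches on the URI's scheme, the same code
writes to local disk if you pass a file:// URI instead, which is exactly
why the default-filesystem setting matters in the original question.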
