Hi,

I have a single-node Hadoop cluster. The Hadoop version:
[patn...@ac4-dev-ims-211]~/dev/hadoop/hadoop-0.19.1% hadoop version
Hadoop 0.19.1
Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 
745977
Compiled by ndaley on Fri Feb 20 00:16:34 UTC 2009


Following is my hadoop-site.xml -
<configuration>
 <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

I have created some directories under my account, and they show up correctly 
using the "hadoop fs" shell command. 

[patn...@ac4-dev-ims-211]~/dev/hadoop/hadoop-0.19.1% hadoop fs -ls 
/user/patnala/tmp/allocation
Found 2 items
drwxr-xr-x   - patnala supergroup          0 2009-04-20 21:58 
/user/patnala/tmp/allocation/1
drwxr-xr-x   - patnala supergroup          0 2009-04-20 21:58 
/user/patnala/tmp/allocation/2

I am trying to retrieve the same information in Java through the 
org.apache.hadoop.fs package.

Following is my code:

Configuration conf = new Configuration();
conf.addResource(new Path(hadoopConfigFile1)); // hadoopConfigFile1 is hadoop-default.xml
conf.addResource(new Path(hadoopConfigFile2)); // hadoopConfigFile2 is hadoop-site.xml
FileSystem fs = FileSystem.get(conf);

FileStatus[] listFiles = fs.listStatus(path);
logger.debug("Obtained directory contents for ap store url, size - " +
    listFiles.length);

The output is below:
DEBUG [main] (Configuration.java:176) - java.io.IOException: config()
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:176)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:164)
        at 
com.yahoo.morocco.systems.optimization.client.planstore.GridPlanConsumer.main(GridPlanConsumer.java:94)

DEBUG [main] (GridPlanConsumer.java:38) - Allocating GridPlanConsumer with 
config  and allocation plan root URL - /user/patnala/tmp/allocation
DEBUG [main] (UnixUserGroupInformation.java:276) - Unix Login: 
patnala,dev,yahoodev,devmorocco,morocco-ims-dev
 INFO [main] (GridPlanConsumer.java:101) - Successfully allocationed plan 
consumer object
DEBUG [main] (GridPlanConsumer.java:54) - Obtained directory contents for ap 
store url, size - 0

The "size - 0" means that it couldn't retrieve any files or directories under 
/user/patnala/tmp/allocation/.

I tried other paths as well, and it seems like the API is treating the path as 
a local directory. 

If I try specifying the complete HDFS path, I get a different exception:
 INFO [main] (GridPlanConsumer.java:101) - Successfully allocationed plan 
consumer object
java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/, expected: 
file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:322)
        at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:52)
        at 
org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:280)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:723)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:748)
        at 
org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:457)
        at 
com.yahoo.morocco.systems.optimization.client.planstore.GridPlanConsumer.getPlans(GridPlanConsumer.java:52)
        at 
com.yahoo.morocco.systems.optimization.client.planstore.GridPlanConsumer.main(GridPlanConsumer.java:103)
ERROR [main] (GridPlanConsumer.java:106) - Program exited with exception: Wrong 
FS: hdfs://localhost:9000/, expected: file:///

My question is: how do I list directories and files stored in HDFS from Java?
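For completeness, here is a minimal standalone sketch of what I am trying to do. One thing I have not tried is setting fs.default.name directly on the Configuration instead of relying on hadoop-site.xml being found; the URI and path below are taken from my config and listing above, and the class name ListHdfsDir is just for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHdfsDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the namenode explicitly, so FileSystem.get()
        // returns a DistributedFileSystem rather than the default
        // LocalFileSystem (file:///).
        conf.set("fs.default.name", "hdfs://localhost:9000");

        FileSystem fs = FileSystem.get(conf);
        FileStatus[] statuses = fs.listStatus(new Path("/user/patnala/tmp/allocation"));
        for (FileStatus status : statuses) {
            System.out.println(status.getPath());
        }
    }
}
```

If the Configuration does not pick up fs.default.name, the "Wrong FS: hdfs://localhost:9000/, expected: file:///" error above would be consistent with the client still using the local filesystem.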

Thanks,
Praveen.
