Hi all,
As we all know, a data node does not store all of its block files in a single
directory; it creates subdirectories as needed. How can I find out the name
of the subdirectory that holds a given block?
I have a Java program that takes a filename and prints the blocks that make
up the file, along with the hosts that store each of these blocks. Now I
need to know the subdirectory (if any) in which the block is stored on a
given host.
This is my program so far:
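import java.net.URI;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.Block;
import org.apache.hadoop.hdfs.protocol.ClientProtocol;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.hdfs.protocol.LocatedBlocks;

class LocateBlocks {
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: LocateBlocks <filename>");
            System.exit(1);
        }
        // filename.
        String fileName = args[0];
        // configuration object.
        Configuration conf = new Configuration();
        // get the DistributedFileSystem object.
        DistributedFileSystem fs = (DistributedFileSystem)
            FileSystem.get(URI.create(fileName), conf);
        // get the corresponding DFSClient object.
        DFSClient client = fs.getClient();
        // get the namenode.
        ClientProtocol nameNode = client.namenode;
        // get file info; the namenode returns null for a missing file.
        HdfsFileStatus fStatus = nameNode.getFileInfo(fileName);
        if (fStatus == null) {
            System.err.println("file not found: " + fileName);
            System.exit(1);
        }
        // get file size.
        long fSize = fStatus.getLen();
        // get block location information for the whole file.
        LocatedBlocks locBlks =
            nameNode.getBlockLocations(fileName, 0, fSize);
        // get a list of blocks for this file.
        List<LocatedBlock> locBlkList = locBlks.getLocatedBlocks();
        // print the number of blocks, then iterate over them.
        System.out.println(locBlkList.size());
        for (LocatedBlock lBlk : locBlkList) {
            // get block name.
            Block blk = lBlk.getBlock();
            String blkName = blk.getBlockName();
            // for each block, get the array of hosts on which the block
            // resides.
            DatanodeInfo[] dNodes = lBlk.getLocations();
            System.out.println(dNodes.length);
            System.out.println(blkName);
            for (DatanodeInfo dNode : dNodes) {
                String hostName = dNode.getHostName();
                System.out.println(hostName);
            }
        }
    }
}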
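For the local side of the lookup, here is a minimal sketch of how the subdirectory could be found on the data node host itself. It is plain Java and not a Hadoop API. It assumes that it runs on the data node, that the local data directory (whatever the data node's data-dir setting points at) is passed in by hand, and that the block file on disk is named exactly like the block name printed by the program above. The class name BlockDirFinder and the method findBlockSubdir are hypothetical names made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: searches a local directory tree
// for a file whose name equals the given block name.
class BlockDirFinder {

    // Returns the directory that contains the block file, relative to dataDir.
    // The result is an empty path if the file sits directly in dataDir, and
    // an empty Optional if no file with that name exists under dataDir.
    static Optional<Path> findBlockSubdir(Path dataDir, String blockName)
            throws IOException {
        try (Stream<Path> paths = Files.walk(dataDir)) {
            return paths
                .filter(Files::isRegularFile)
                // exact match only, so files with other suffixes are skipped
                .filter(p -> p.getFileName().toString().equals(blockName))
                .map(p -> dataDir.relativize(p.getParent()))
                .findFirst();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BlockDirFinder <local-data-dir> <block-name>");
            System.exit(1);
        }
        Optional<Path> subdir = findBlockSubdir(Paths.get(args[0]), args[1]);
        System.out.println(subdir.map(Path::toString).orElse("block not found"));
    }
}
```

This only helps when one can log in to each host returned by the program above; it does not answer whether the subdirectory can be obtained remotely through the client API.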
Thanks