Hey Momina,
Here's the path on 20:
DistributedFileSystem#getFileBlockLocations
-> DFSClient#getFileBlockLocations
-> callGetBlockLocations
-> ClientProtocol#getBlockLocations
-> (via proxy) NameNode#getBlockLocations
See createNamenode and createRPCNamenode in DFSClient for where that proxy is created.
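In case it helps to see why the chain looks circular, here is a minimal sketch in plain Java with no Hadoop dependency. Every class and method in it is a hypothetical stand-in named after the steps above, not the real Hadoop code, and the host names and file path are made up. It only illustrates the pattern: each layer forwards to a same-named method one level down, and the last hop goes through a java.lang.reflect.Proxy that implements the shared protocol interface and hands the call to the NameNode side.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockLocationCallChain {

    /** Hypothetical stand-in for ClientProtocol, the interface both sides share. */
    public interface Protocol {
        List<String> getBlockLocations(String src, long start, long len);
    }

    /** Hypothetical stand-in for the NameNode: the only layer that actually answers. */
    public static class FakeNameNode implements Protocol {
        private final List<String> trace;

        public FakeNameNode(List<String> trace) {
            this.trace = trace;
        }

        @Override
        public List<String> getBlockLocations(String src, long start, long len) {
            trace.add("NameNode#getBlockLocations");
            return Arrays.asList("host1", "host2"); // made-up replica hosts
        }
    }

    /** Plays the role of the proxy creation step; a real RPC proxy would go over the network. */
    static Protocol createProxy(Protocol remote, List<String> trace) {
        InvocationHandler handler = (proxy, method, args) -> {
            trace.add("ClientProtocol#" + method.getName() + " (proxy)");
            return method.invoke(remote, args); // forward to the "remote" side
        };
        return (Protocol) Proxy.newProxyInstance(
                Protocol.class.getClassLoader(), new Class<?>[] {Protocol.class}, handler);
    }

    static List<String> callGetBlockLocations(
            Protocol namenode, String src, long start, long len, List<String> trace) {
        trace.add("callGetBlockLocations");
        return namenode.getBlockLocations(src, start, len);
    }

    static List<String> dfsClientGetFileBlockLocations(
            Protocol namenode, String src, long start, long len, List<String> trace) {
        trace.add("DFSClient#getFileBlockLocations");
        return callGetBlockLocations(namenode, src, start, len, trace);
    }

    static List<String> distributedFileSystemGetFileBlockLocations(
            Protocol namenode, String src, long start, long len, List<String> trace) {
        trace.add("DistributedFileSystem#getFileBlockLocations");
        return dfsClientGetFileBlockLocations(namenode, src, start, len, trace);
    }

    /** Runs one lookup through every layer and returns the order in which they were hit. */
    public static List<String> traceCall(String path) {
        List<String> trace = new ArrayList<>();
        Protocol namenode = createProxy(new FakeNameNode(trace), trace);
        distributedFileSystemGetFileBlockLocations(namenode, path, 0L, 1024L, trace);
        return trace;
    }

    public static void main(String[] args) {
        // The path is a made-up example.
        for (String step : traceCall("/user/momina/input.txt")) {
            System.out.println(step);
        }
    }
}
```

Running main prints the five steps in the same order as the path above. The point is that none of the client-side methods compute anything themselves; only the object behind the proxy does.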
Hi,
I am still going in circles. I still can't pinpoint a single
function call that interacts with HDFS for block locations... It
is as if the files are making circular calls to getBlockLocations(), which
is implemented such that it calls the same function in a different
class ... I mean it is n
Hi,
The (o.a.h.fs) FileSystem API has getFileBlockLocations, which returns the
hosts that hold the replicas of each block in a file.
In the general case, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits()
calls this method, and the resulting host list is passed on for job scheduling
along with the split info.
Hope this is what you were looking for.
Am