Hey Momina,
Here's the call path in 0.20:
DistributedFileSystem#getFileBlockLocations
-> DFSClient#getFileBlockLocations
-> callGetBlockLocations
-> ClientProtocol#getBlockLocations
-> (via proxy) NameNode#getBlockLocations
See createNamenode and createRPCNamenode in the DFSClient
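To see why the trail feels circular, here is a minimal, self-contained sketch of the same delegation pattern. These are toy stand-ins (the class names mirror the real ones, but the bodies are mine, not Hadoop code): a shared RPC interface that both a client-side proxy and the NameNode implement, with each layer delegating to the next under the same method name.

```java
// Toy sketch of the 0.20 dispatch chain -- simplified stand-ins, not Hadoop code.
interface ClientProtocol {                       // shared RPC interface
    String getBlockLocations(String src);
}

class NameNode implements ClientProtocol {       // server side: the real answer
    public String getBlockLocations(String src) {
        return "blocks-of:" + src;               // would consult the block map
    }
}

class DFSClient {
    private final ClientProtocol namenode;       // in real code: an RPC proxy
    DFSClient(ClientProtocol namenode) { this.namenode = namenode; }
    String getBlockLocations(String src) {
        // looks "circular": same method name, but this hop crosses the wire
        return namenode.getBlockLocations(src);
    }
}

class DistributedFileSystem {
    private final DFSClient dfs;
    DistributedFileSystem(DFSClient dfs) { this.dfs = dfs; }
    String getFileBlockLocations(String src) {
        return dfs.getBlockLocations(src);
    }
}

public class CallPathDemo {
    static String locate(String src) {
        DistributedFileSystem fs =
            new DistributedFileSystem(new DFSClient(new NameNode()));
        return fs.getFileBlockLocations(src);
    }

    public static void main(String[] args) {
        System.out.println(locate("/some/file"));
    }
}
```

So each class really does call "the same function in a different class"; the chain bottoms out when the ClientProtocol proxy ships the call over RPC to the NameNode, which answers from its block map.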
hi
i am still going in circles; i still can't pinpoint a single
function call that interacts with HDFS for the block locations... it
is as if the files are making circular calls to getBlockLocations(), which
is implemented such that it calls the same function in a different
class ... i mean it is n
Hi,
The (o.a.h.fs) FileSystem API has getFileBlockLocations(), which is used to
determine replica locations.
In the common case, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits()
calls this method, and the resulting host information is passed on to the job
scheduler along with the split info.
Hope this is what you were looking for.
Am
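In case a concrete sketch helps: the idea is that each input split carries the list of hosts storing replicas of its data (as reported by getFileBlockLocations), and the scheduler prefers splits whose data is local to the node asking for work. The code below is a hypothetical illustration; pickLocalSplit and the Split record are my names, not Hadoop's.

```java
import java.util.List;

// Hypothetical sketch of locality-aware split selection -- not Hadoop code.
public class LocalityDemo {
    // A split knows which hosts store replicas of its data, the way
    // BlockLocation#getHosts() reports them for a file range.
    record Split(String name, List<String> hosts) {}

    // Prefer a split whose data is stored on the requesting node.
    static String pickLocalSplit(List<Split> splits, String node) {
        for (Split s : splits)
            if (s.hosts().contains(node))
                return s.name();                       // node-local map task
        return splits.isEmpty() ? null : splits.get(0).name(); // else: remote read
    }

    public static void main(String[] args) {
        List<Split> splits = List.of(
            new Split("split-0", List.of("nodeA", "nodeB")),
            new Split("split-1", List.of("nodeC", "nodeD")));
        // nodeC holds a replica of split-1's data, so it gets split-1
        System.out.println(pickLocalSplit(splits, "nodeC"));
    }
}
```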
hi,
i am trying to figure out how hadoop uses data locality to schedule maps on
nodes that locally store the map input ... going through the code i am going in
circles between a couple of files but not really getting anywhere ... that
is to say that i can't locate the HDFS API or func that can commu