I was able to resolve the discrepancy between my method trace and the details shown in the web UI.
I had been modifying and compiling the code only on the master node, so only one node appeared in the print trace. I have now added the print statements on all the nodes and compiled each one individually, then started all the processes and ran a job. Each node dealt with only one other node for partition replication. That way, the storage details from the web UI and the print traces from all the nodes are in sync.

But where do I get all the replication nodes at once? The getPeers() method calls askDriverWithReply(), which returns only one BlockManagerId on each node.

Can someone please help me figure out where to look to obtain all the nodes chosen for replication?

Thank you

On Tue, Sep 16, 2014 at 12:04 PM, rapelly kartheek <kartheek.m...@gmail.com> wrote:

> Hi,
>
> I was tracing the flow of the replicate method in BlockManager.scala. I am
> trying to find out where exactly in the code the resources are
> acquired for RDD replication.
>
> I find that the BlockManagerMaster.getPeers() method returns only one
> BlockManagerId for all the RDD partitions.
>
> But the storage details of the RDD in the web UI show that all the
> partitions are replicated over multiple nodes.
>
> I just want to find out where we get the resources allocated for RDD
> replication.
>
> Can someone please help me out in this regard?
>
> Thank you
> Karthik