Hi all, I want to measure the time it takes to read the input split for a map task. On my cluster, I am interested in measuring the overhead of fetching a map task's input over the network as opposed to reading it from the local disk.
Is there an easy way to instrument the relevant function so that it logs this information (say, in the TaskTracker logs)? Thanks, Abhishek