non-local map task input

2013-07-21 Thread Grandl Robert
Hi guys, I am trying to figure out all the points in the HDFS code where HDFS traffic is read/written. As far as I can tell, most of the traffic goes through BlockSender/BlockReceiver, right? However, when a client does a copyFromLocal, or reads a file, or for a map task whose input is not

Re: non-local map task input

2013-07-22 Thread Grandl Robert
Can anyone help me with this, please? Thanks, Robert From: Grandl Robert To: "hdfs-dev@hadoop.apache.org" Sent: Sunday, July 21, 2013 8:41 PM Subject: non-local map task input Hi guys, I am trying to figure out all the points in hdfs code

HDFS writes a lot to disks

2015-04-17 Thread Grandl Robert
Hi, I am running some Pig queries atop Tez atop YARN. My Pig query has a large stage which reads 45 GB of data and outputs less than 1 MB. The stage is processed by 200 tasks on a 9-machine cluster, with up to 8 tasks running in parallel, each with 7 GB of memory. I am monitoring the resource usage