Re: Basic Hadoop Doubt

2010-05-17 Thread Hemanth Yamijala
Vamsi,
> I have some basic doubt on hadoop Input Data placement...
> Like, If i input some 30GB of data to hadoop program, it will place the 30gb into HDFS into some set of files based on some input formats..
Conceptually, it would be more accurate to say that it splits the data into 'blo

[jira] Created: (HADOOP-6769) Add an API in FileSystem to get FileSystem instances based on users

2010-05-17 Thread Devaraj Das (JIRA)
Add an API in FileSystem to get FileSystem instances based on users
Key: HADOOP-6769
URL: https://issues.apache.org/jira/browse/HADOOP-6769
Project: Hadoop Common
Issue Type

RE: Task scheduler

2010-05-17 Thread Segel, Mike
+1. I agree with Steve that sometimes you need to redirect where you want the work to occur. Over time, your cloud will not have homogeneous data nodes. You may end up with a cluster of nodes that have a Fermi card (NVIDIA CUDA-enabled cards) where you want to do some serious number crunching. [ I

Re: Task scheduler

2010-05-17 Thread Steve Loughran
Saurabh Agarwal wrote: Hemanth, Thanks!! Saurabh Agarwal On Fri, May 14, 2010 at 9:49 AM, Hemanth Yamijala wrote: Saurabh, let me reframe my question. I wanted to know how the job tracker decides the assignment of input splits to task trackers based on each task tracker's data locality. Where is th

Basic Hadoop Doubt

2010-05-17 Thread Vamc
Hi All, Vamc here, Buddy in Hadoop. I have some basic doubts on Hadoop input data placement. If I input some 30 GB of data to a Hadoop program, it will place the 30 GB into HDFS as some set of files based on some input formats. I have 2 doubts here. 1. Each time I run a program 30GB

Re: What if an XML file cross boundary of HDFS chunks?

2010-05-17 Thread Vamc
Hi Steve, I am new to this forum and a buddy on Hadoop. I have the same kind of problem, where the input file cannot be treated as a text file. Can't we define our own InputFormat, InputSplit and RecordReader? Thanks, Vamsi
Jeff Zhang-4 wrote:
> Hi Steve,
> When you

[jira] Created: (HADOOP-6768) RPC client can response more efficiently when sendParam() got IOException

2010-05-17 Thread Xiao Kang (JIRA)
RPC client can response more efficiently when sendParam() got IOException
Key: HADOOP-6768
URL: https://issues.apache.org/jira/browse/HADOOP-6768
Project: Hadoop Common