Re: data locality on HDFS

2010-05-07 Thread momina khan
hi, i am still going in circles. i still can't pinpoint a single function call that interacts with HDFS for block locations... it is as if the files are making circular calls to getBlockLocations(), which is implemented such that it calls the same function in a different class ... i mean it is n

Re: [DISCUSSION]: Integrating SureLogic into Hadoop

2010-05-07 Thread Luke
On a related note, any objections to using http://code.google.com/p/thread-weaver/ in our unit tests? __Luke On Fri, May 7, 2010 at 4:35 PM, Todd Lipcon wrote: > Hi Cos, > > This looks great, and I'm excited to have more ways of finding these tricky > bugs. Are there any examples of bugs found alr

Re: [DISCUSSION]: Integrating SureLogic into Hadoop

2010-05-07 Thread Todd Lipcon
Hi Cos, This looks great, and I'm excited to have more ways of finding these tricky bugs. Are there any examples of bugs found already by these techniques? The one concern I have about the proposal is with this: > SureLogic analysis is going to be included in the test-patch process. This said new

Re: Minutes: Hadoop Contributor Meeting 05/06/2010

2010-05-07 Thread Tom White
Here's my (single) slide about the 0.21 release. Tom On Thu, May 6, 2010 at 5:38 PM, Arun C Murthy wrote: > # Shared goals >  - Hadoop is HDFS & Map-Reduce in the context of this set of slides > # Priorities >  * Yahoo >    - Correctness >    - Availability: Not the same as high-availability (6

Re: data locality on HDFS

2010-05-07 Thread Amogh Vasekar
Hi, The (o.a.h.fs) FileSystem API has getFileBlockLocations(), which is used to determine where the replicas of each block live. In the general case, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits() calls this method, and the resulting host list is passed on for job scheduling along with the split info. Hope this is what you were looking for. Am
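
The flow described in this reply (block locations are attached to each split, then used when scheduling) can be sketched as a toy model. This is not Hadoop code: `Split`, `get_splits`, and `pick_split` are hypothetical names made up for illustration, the block locations are passed in by hand rather than fetched from HDFS, and the real scheduler considers more than node-local placement (rack locality, for instance).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    """Hypothetical stand-in for an input split: a byte range plus replica hosts."""
    path: str
    start: int
    length: int
    hosts: tuple  # nodes that store a replica of the underlying block


def get_splits(path, block_locations):
    """Build one split per block.

    block_locations is a list of (offset, length, hosts) tuples, standing in
    for what a block-location lookup on the file system would return.
    """
    return [Split(path, off, length, tuple(hosts))
            for off, length, hosts in block_locations]


def pick_split(pending, node):
    """Pick work for `node`, preferring a split it stores locally.

    Returns (split, is_local). Falls back to the first pending split when
    nothing is local, and to (None, False) when no work is left.
    """
    for split in pending:
        if node in split.hosts:
            pending.remove(split)  # safe: we return right after mutating
            return split, True
    if pending:
        return pending.pop(0), False
    return None, False


if __name__ == "__main__":
    # Made-up locations for a two-block file.
    locations = [(0, 64, ["nodeA", "nodeB"]), (64, 64, ["nodeB", "nodeC"])]
    pending = get_splits("/data/input.txt", locations)
    print(pick_split(pending, "nodeC"))  # the second block is local to nodeC
    print(pick_split(pending, "nodeC"))  # only a non-local split remains
```

The point of the sketch is only that locality information travels with the split itself, so the scheduling side never has to ask the file system again.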

data locality on HDFS

2010-05-07 Thread momina khan
hi, i am trying to figure out how hadoop uses data locality to schedule maps on nodes which locally store the map input ... going through the code i am going in circles between a couple of files but not really getting anywhere ... that is to say that i can't locate the HDFS API or func that can commu