Re: Long running tests failure

2011-09-27 Thread Milind.Bhandarkar
Disk space is not an issue on my MBP:

Filesystem    1K-blocks      Used Available Use% Mounted on
/dev/disk0s2  488050672 198290788 289503884  41% /

- milind

On 9/23/11 4:15 PM, "Todd Lipcon" wrote:
>I think TestLargeBlock fails if you have low disk space on your dev
>machine -- since the

Re: Long running tests failure

2011-09-28 Thread Milind.Bhandarkar
Cool! I commented out that line as well. Can I override that parameter from the command line? That would be less of a headache.

- milind

On 9/28/11 3:48 AM, "Jeff MAURY" wrote:
>I faced the same issue on my brand new MBP (i7) laptop.
>The Hadoop pom.xml files are configured with a surefire plugin timeout of 900s
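For illustration: Surefire's forked-process timeout (forkedProcessTimeoutInSeconds) maps to the surefire.timeout user property, so an override like the one below should work from the command line -- assuming the Hadoop pom.xml references the property rather than hard-coding the value:

    # Sketch: bump the per-fork test timeout from 900s to 30 minutes.
    # Not verified against the Hadoop pom of that era.
    mvn test -Dsurefire.timeout=1800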

Re: Long running tests failure

2011-09-29 Thread Milind.Bhandarkar
Yeah! That's really crazy. I don't have time to debug it, though. I will move my dev environment to Linux. :-(

- milind

On 9/29/11 4:12 AM, "Jeff MAURY" wrote:
>I have run into some very strange behaviour: as it seems the Hadoop build does not
>work on MacOS, I built an Ubuntu VM that I launched on my

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
As someone who has worked with HDFS-compatible distributed file systems that support append, I can vouch for its extensive usage. I have seen how simple it becomes to create tar archives, and later append files to them, without writing special, inefficient code to do so. I have seen it used in arc
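For illustration, a minimal sketch (not from the original mail) of the append pattern being described, using the public FileSystem API; the path and payload are hypothetical, and the cluster must have append support enabled:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AppendExample {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path archive = new Path("/archives/daily.tar"); // hypothetical path
        // Reopen the existing file for append and add another member.
        try (FSDataOutputStream out = fs.append(archive)) {
          out.write("new-member-bytes".getBytes("UTF-8"));
          out.hflush(); // make the appended bytes visible to readers
        }
      }
    }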

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
Answers inline.

On 3/21/12 10:32 AM, "Eli Collins" wrote:

>Why not just write new files and use Har files, because Har files are a
>pita?

Yes, and har creation is an MR job, which is totally I/O bound, and yet takes up slots/containers, reducing cluster utilization.

>Can you elaborate on the
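For reference, har creation goes through the 'hadoop archive' tool, which launches exactly the kind of MapReduce job described above; the paths here are hypothetical:

    # Pack daily-logs into a har (runs as an MR job).
    hadoop archive -archiveName logs-2012-03.har -p /user/milind daily-logs /user/milind/archives
    # Reading back goes through the har:// scheme:
    hadoop fs -ls har:///user/milind/archives/logs-2012-03.har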

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
>1. If the daily files are smaller than 1 block (seems unlikely)

Even at a large HDFS installation, the avg file size was < 1.5 blocks. Bucketing causes the file sizes to drop.

>2. The small files problem (a typical NN can store 100-200M files, so
>a problem for big users)

Big users probably ha
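As a rough back-of-the-envelope (my numbers, not from the thread): each file and block consumes on the order of 150 bytes of NameNode heap, so 200M files at ~1.5 blocks each works out to roughly 200M x (1 + 1.5) x 150 bytes ≈ 75 GB of heap just for the namespace -- which is why the small-files ceiling bites big users.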

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
Eli,

To clarify a little bit: I think HDFS-3120 is the right thing to do, to disable appends while still enabling hsync in branch-1. But going forward (say 0.23+), having appends working correctly will definitely add value, and make HDFS more palatable for lots of other workloads. Of course, I

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
I would also like to point to the work being done on PLFS-HDFS:
http://institute.lanl.gov/isti/irhpit/presentations/PLFS-HDFS.pdf

This would be made much simpler by allowing appends. Checkpointing in MPI is a very common use case, and after Hamster, PLFS-HDFS becomes an attractive way to do this. (S

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
>Absolutely, I'd like to learn more about what append/truncate buys us.

Indeed. Let's postpone this discussion to Q2 then.

Thanks,
- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the vie

Re: [DISCUSS] Remove append?

2012-03-21 Thread Milind.Bhandarkar
Eli,

If HDFS-3120 is committed to both 1.x and trunk/0.23.x, then one will be able to disable appends (while keeping hflush) using different config variables. By default (i.e., in hdfs-default.xml), we should set dfs.support.append to false, and dfs.support.hsync to true. That way, we get enough t
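A sketch of what the proposed defaults would look like in hdfs-site.xml form; dfs.support.append exists in branch-1, while dfs.support.hsync is the separate flag this mail proposes via HDFS-3120, so treat this as the shape of the proposal rather than shipped configuration:

    <property>
      <name>dfs.support.append</name>
      <value>false</value>
    </property>
    <property>
      <name>dfs.support.hsync</name>
      <value>true</value>
    </property>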

Re: [DISCUSS] Remove append?

2012-03-22 Thread Milind.Bhandarkar
Eli,

I think by "current definition of visible length", you mean that once a client opens a file and gets the block list, it will always be able to read up to the length at open. However, correct me if I am wrong, but this definition is already violated if the file is deleted after open. So, truncate d

Re: [RESULT] - [VOTE] Rename hadoop branches post hadoop-1.x

2012-04-03 Thread Milind.Bhandarkar
Arun,

I am even more confused now than I was before. Here you say:

> Essentially 'trunk' is where incompatible changes *may* be committed (in
> future). We should allow for that.

On another thread, responding to Avner (re: MAPREDUCE-4049?), you say:

> We do expect 'new features' to make it to t

Re: [RESULT] - [VOTE] Rename hadoop branches post hadoop-1.x

2012-04-03 Thread Milind.Bhandarkar
Thanks, ATM. I guess the "*may*" emphasis confused me. Just to get some more clarity: what would the guideline be for a new feature, such as https://issues.apache.org/jira/browse/MAPREDUCE-4049, which maintains compatibility for 1.x, but is not relevant to trunk, because the codebases have completely

Re: [RESULT] - [VOTE] Rename hadoop branches post hadoop-1.x

2012-04-03 Thread Milind.Bhandarkar
To my knowledge, shuffle is already pluggable from 0.23 onwards, as long as it is used only by the MapReduce framework. That's why Avner says: "In parallel, I'll try to *learn what exists* in 0.23". (Emphasis my own.) That's why I was wondering about the insistence on committing to trunk first.

- Mi
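For reference, the plug point MAPREDUCE-4049 eventually added is a job-level class name for the reduce-side shuffle; the plugin class below is hypothetical, and the property name is from the version as eventually committed, so check your release:

    <!-- mapred-site.xml, or set per job; plugin class is hypothetical -->
    <property>
      <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
      <value>com.example.shuffle.RdmaShuffleConsumerPlugin</value>
    </property>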

Re: [RESULT] - [VOTE] Rename hadoop branches post hadoop-1.x

2012-04-03 Thread Milind.Bhandarkar
Great! Thanks @atm,

- milind

On 4/3/12 3:21 PM, "Aaron T. Myers" wrote:
>If that's the case, then there doesn't seem to be any question here. The
>feature is in trunk, and an implementation could be done for an older
>release branch that would be compatible with that branch. Sure, the code
>to

Re: Make Hadoop NetworkTopology and data locality more pluggable for other deploying topology like: virtualization.

2012-06-04 Thread Milind.Bhandarkar
That's great, Junping. Hoping to see this in trunk / Hadoop 2.0 and Hadoop 1.1 soon.

- milind

On Jun 4, 2012, at 8:48 AM, Jun Ping Du wrote:
> Hello Folks,
> I just filed an umbrella JIRA today to address the current NetworkTopology
> issue of binding strictly to a three-tier network. The motiv
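For context, the existing pluggability point this builds on is the rack-mapping hook in core-site.xml (named topology.node.switch.mapping.impl in 1.x); the custom mapping class below is hypothetical, and the default is ScriptBasedMapping:

    <property>
      <name>net.topology.node.switch.mapping.impl</name>
      <value>com.example.net.FourLayerTopologyMapping</value>
    </property>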

Long running tests failure

2011-09-23 Thread Milind.Bhandarkar
Folks,

When running TestLargeBlock and TestBalancer, which tend to take a long time to run on my dev box, I get the following error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.6:test (default-test) on project hadoop-hdfs: Error while executing forked tests.; n
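For anyone trying to reproduce this, the two tests can be run in isolation with standard Surefire test selection; the module path assumes the post-mavenization trunk layout:

    cd hadoop-hdfs-project/hadoop-hdfs
    mvn test -Dtest=TestLargeBlock,TestBalancer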