stack <[email protected]> wrote on 06/06/2011 10:57:50 PM:

> From: stack <[email protected]>
> To: Mike Spreitzer/Watson/IBM@IBMUS
> Cc: [email protected]
> Date: 06/06/2011 10:58 PM
> Subject: Re: Hadoop not working after replacing hadoop-core.jar with
> hadoop-core-append.jar
>
> ...
>
> > Why does hbase include a hadoop-core.jar? The instructions say I should
> > replace it, so why am I given it in the first place?
> >
>
> You have to replace it if you are running on an hadoop that is other
> than an exact match to the jar we ship with (If you are doing
> standalone mode or if you are running unit tests, the jar is needed
> since we have a bunch of Hadoop dependencies from our Configuration to
> UI to MapReduce to Connection to HDFS etc.)
>
> ...
>
> Yours,
> St.Ack
Let me see if I have got this straight. Hadoop branch-0.20-append is not an immutable thing; it has evolved a little over time. The hadoop-core.jar included in the HBase distribution was built from some version of branch-0.20-append. If my own Hadoop cluster is running EXACTLY that same version of branch-0.20-append, then I do not need to replace any files anywhere. However, since nobody tells me which version of branch-0.20-append HBase's hadoop-core.jar was built from, I cannot in any way be confident that my cluster is running EXACTLY the same version, even if it is on branch-0.20-append.

So the net result is that in all distributed cases (except when I import pre-built Hadoop+HBase from Cloudera or elsewhere) I have to build branch-0.20-append and copy its core JAR into my HBase lib. Have I got this right? The book still does not say that clearly. In fact, the book still points to my old email saying I did it the other way around.

Your reply above clearly seems to imply that I need to replace HBase's hadoop core JAR only in some distributed cases. Yet the rest of the email conversation on this point seems to have settled that HBase's hadoop core JAR needs to be replaced in all distributed cases.

Thanks,
Mike
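P.S. If I have understood the procedure correctly, the replacement would look roughly like the sketch below. This is only my reading of it, not an official recipe: the paths (HADOOP_SRC, HBASE_HOME) are placeholders for wherever your branch-0.20-append checkout and HBase install actually live, and I am assuming `ant jar` as the build step for a Hadoop 0.20-era source tree.

```shell
# Placeholder locations -- adjust to your own checkout and install.
HADOOP_SRC=/usr/local/src/hadoop-branch-0.20-append
HBASE_HOME=/usr/local/hbase

# Build the core jar from the EXACT branch-0.20-append checkout
# that your cluster is running.
cd "$HADOOP_SRC" && ant jar

# Remove the hadoop-core jar HBase ships with, and put in the one
# that matches your cluster. Every node running HBase needs this.
rm "$HBASE_HOME"/lib/hadoop-core-*.jar
cp "$HADOOP_SRC"/build/hadoop-core-*.jar "$HBASE_HOME"/lib/
```

After that, restarting HBase should (if I have this right) leave client and cluster on matching Hadoop RPC versions.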
