I was waiting for Daniel to post the minutes from the YARN meetup to talk about this. Anyway, in that discussion, we identified a bunch of key upgrade-related scenarios that no one seems to have validated, at least judging from the representation at the YARN meetup. I'm going to create a wiki page listing all these scenarios.
But back to the bug that Junping raised. At this point, we don't have a clear path towards running 2.x applications on 3.0.0 clusters. So, our claim that rolling upgrades already work is not accurate.

One of the two options that Junping proposed should be pursued before we close the release. I'm in favor of calling out that rolling-upgrade support is withdrawn or caveated, and pushing for progress instead of blocking the release.

Thanks
+Vinod

> On Dec 12, 2017, at 5:44 PM, Junping Du <j...@hortonworks.com> wrote:
>
> Thanks Andrew for pushing new RC for 3.0.0. I was out last week, just get
> chance to validate new RC now.
>
> Basically, I found two critical issues with the same rolling upgrade scenario
> as where HADOOP-15059 get found previously:
> HDFS-12920, we changed value format for some hdfs configurations that old
> version MR client doesn't understand when fetching these configurations. Some
> quick workarounds are to add old value (without time unit) in hdfs-site.xml
> to override new default values but will generate many annoying warnings. I
> provided my fix suggestions on the JIRA already for more discussion.
> The other one is YARN-7646. After we workaround HDFS-12920, will hit the
> issue that old version MR AppMaster cannot communicate with new version of
> YARN RM - could be related to resource profile changes from YARN side but
> root cause are still in investigation.
>
> The first issue may not belong to a blocker given we can workaround this
> without code change. I am not sure if we can workaround 2nd issue so far. If
> not, we may have to fix this or compromise with withdrawing support of
> rolling upgrade or calling it a stable release.
>
>
> Thanks,
>
> Junping
>
> ________________________________________
> From: Robert Kanter <rkan...@cloudera.com>
> Sent: Tuesday, December 12, 2017 3:10 PM
> To: Arun Suresh
> Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron T.
> Myers; common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
>
> +1 (binding)
>
> + Downloaded the binary release
> + Deployed on a 3 node cluster on CentOS 7.3
> + Ran some MR jobs, clicked around the UI, etc
> + Ran some CLI commands (yarn logs, etc)
>
> Good job everyone on Hadoop 3!
>
>
> - Robert
>
> On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh <asur...@apache.org> wrote:
>
>> +1 (binding)
>>
>> - Verified signatures of the source tarball.
>> - built from source - using the docker build environment.
>> - set up a pseudo-distributed test cluster.
>> - ran basic HDFS commands
>> - ran some basic MR jobs
>>
>> Cheers
>> -Arun
>>
>> On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> As a reminder, this vote closes tomorrow at 12:31pm, so please give it a
>>> whack if you have time. There are already enough binding +1s to pass this
>>> vote, but it'd be great to get additional validation.
>>>
>>> Thanks to everyone who's voted thus far!
>>>
>>> Best,
>>> Andrew
>>>
>>>
>>>
>>> On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu <l...@cloudera.com> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> * Verified src tarball and bin tarball, verified md5 of each.
>>>> * Build source with -Pdist,native
>>>> * Started a pseudo cluster
>>>> * Run ec -listPolicies / -getPolicy / -setPolicy on / , and run hdfs
>>>> dfs put/get/cat on "/" with XOR-2-1 policy.
>>>>
>>>> Thanks Andrew for this great effort!
>>>>
>>>> Best,
>>>>
>>>>
>>>> On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang <andrew.w...@cloudera.com>
>>>> wrote:
>>>>> Hi Wei-Chiu,
>>>>>
>>>>> The patchprocess directory is left over from the create-release process,
>>>>> and it looks empty to me. We should still file a create-release JIRA to fix
>>>>> this, but I think this is not a blocker. Would you agree?
>>>>>
>>>>> Best,
>>>>> Andrew
>>>>>
>>>>> On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <weic...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Andrew, thanks the tremendous effort.
>>>>>> I found an empty "patchprocess" directory in the source tarball, that is
>>>>>> not there if you clone from github. Any chance you might have some leftover
>>>>>> trash when you made the tarball?
>>>>>> Not wanting to nitpicking, but you might want to double check so we don't
>>>>>> ship anything private to you in public :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Dec 12, 2017 at 7:48 AM, Ajay Kumar <ajay.ku...@hortonworks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>> Thanks for driving this, Andrew Wang!!
>>>>>>>
>>>>>>> - downloaded the src tarball and verified md5 checksum
>>>>>>> - built from source with jdk 1.8.0_111-b14
>>>>>>> - brought up a pseudo distributed cluster
>>>>>>> - did basic file system operations (mkdir, list, put, cat) and
>>>>>>> confirmed that everything was working
>>>>>>> - Run word count, pi and DFSIOTest
>>>>>>> - run hdfs and yarn, confirmed that the NN, RM web UI worked
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Ajay
>>>>>>>
>>>>>>> On 12/11/17, 9:35 PM, "Xiao Chen" <x...@cloudera.com> wrote:
>>>>>>>
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>> - downloaded src tarball, verified md5
>>>>>>> - built from source with jdk1.8.0_112
>>>>>>> - started a pseudo cluster with hdfs and kms
>>>>>>> - sanity checked encryption related operations working
>>>>>>> - sanity checked webui and logs.
>>>>>>>
>>>>>>> -Xiao
>>>>>>>
>>>>>>> On Mon, Dec 11, 2017 at 6:10 PM, Aaron T.
>>>>>>> Myers <a...@apache.org> wrote:
>>>>>>>
>>>>>>>> +1 (binding)
>>>>>>>>
>>>>>>>> - downloaded the src tarball and built the source (-Pdist -Pnative)
>>>>>>>> - verified the checksum
>>>>>>>> - brought up a secure pseudo distributed cluster
>>>>>>>> - did some basic file system operations (mkdir, list, put, cat) and
>>>>>>>> confirmed that everything was working
>>>>>>>> - confirmed that the web UI worked
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Aaron
>>>>>>>>
>>>>>>>> On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <andrew.w...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Let me start, as always, by thanking the efforts of all the contributors
>>>>>>>>> who contributed to this release, especially those who jumped on the issues
>>>>>>>>> found in RC0.
>>>>>>>>>
>>>>>>>>> I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
>>>>>>>>> fixed JIRAs since the previous 3.0.0-beta1 release.
>>>>>>>>>
>>>>>>>>> You can find the artifacts here:
>>>>>>>>>
>>>>>>>>> http://home.apache.org/~wang/3.0.0-RC1/
>>>>>>>>>
>>>>>>>>> I've done the traditional testing of building from the source tarball and
>>>>>>>>> running a Pi job on a single node cluster. I also verified that the shaded
>>>>>>>>> jars are not empty.
>>>>>>>>>
>>>>>>>>> Found one issue that create-release (probably due to the mvn deploy change)
>>>>>>>>> didn't sign the artifacts, but I fixed that by calling mvn one more time.
>>>>>>>>> Available here:
>>>>>>>>>
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehadoop-1075/
>>>>>>>>>
>>>>>>>>> This release will run the standard 5 days, closing on Dec 13th at 12:31pm
>>>>>>>>> Pacific. My +1 to start.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Lei (Eddy) Xu
>>>> Software Engineer, Cloudera
>>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org