Thanks Andrew for creating the Hadoop 3.0.0-alpha2 RC0 release.

+1 (non-binding)
--Downloaded source and built from it.
--Deployed on a pseudo-distributed cluster.
--Ran sample MR jobs and tested basic HDFS operations.
--Did a sanity check for the RM and NM UI.

Best,
zhihai

On Wed, Jan 25, 2017 at 8:07 AM, Kuhu Shukla <kshu...@yahoo-inc.com.invalid> wrote:

> +1 (non-binding)
> * Built from source
> * Deployed on a pseudo-distributed cluster (Mac)
> * Ran wordcount and sleep jobs.
>
>
> On Wednesday, January 25, 2017 3:21 AM, Marton Elek <me...@hortonworks.com> wrote:
>
> Hi,
>
> I also did a quick smoketest with the provided 3.0.0-alpha2 binaries:
>
> TL;DR: it works well.
>
> Environment:
> * 5 hosts, Docker-based Hadoop cluster, every component in a separate container (5 datanodes / 5 nodemanagers / ...)
> * Components:
>   * HDFS/YARN cluster (upgraded from 2.7.3 to 3.0.0-alpha2 using the binary package under vote)
>   * Zeppelin 0.6.2 / 0.7.0-RC2
>   * Spark 2.0.2 / 2.1.0
>   * HBase 1.2.4 + ZooKeeper
>   * additional Docker containers for configuration management and monitoring
> * No HA, no Kerberos, no wire encryption
>
> * HDFS cluster upgraded successfully from 2.7.3 (with about 200G of data)
> * Imported 100G of data into HBase successfully
> * Started Spark jobs to process 1G of JSON from HDFS (using a Spark master/slave cluster). It worked even when I used Zeppelin 0.6.2 + Spark 2.0.2 (with the old Hadoop client included). Obviously the old version can't use the new YARN cluster, as the token file format has changed.
> * Upgraded my setup to Zeppelin 0.7.0-RC2 / Spark 2.1.0 (distribution without Hadoop) / Hadoop 3.0.0-alpha2. It also worked well: processed the same JSON files from HDFS with Spark jobs (from Zeppelin) on the YARN cluster (master: yarn, deploy-mode: cluster)
> * Started Spark jobs (with spark-submit, master: yarn) to count records from the HBase database: OK
> * Started example MapReduce jobs from the distribution over YARN. It was OK, but only with specific configuration (see below)
>
> So my overall impression is that it works very well (at least with my 'smalldata').
>
> Some notes (none of them are blocking):
>
> 1. To run the example MapReduce jobs I had to define HADOOP_MAPRED_HOME on the command line:
>
> ./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar pi -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 10 10
>
> and in yarn-site:
>
> yarn.nodemanager.env-whitelist: JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR
>
> I don't know the exact reason for the change, but 2.7.3 was more user-friendly, as the examples could be run without any specific configuration.
>
> For the same reason I didn't start HBase MapReduce jobs with the hbase command-line app (there may be some option for hbase to define MAPRED_HOME_DIR as well, but by default I got a ClassNotFoundException for one of the MR classes).
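> (A possible simplification, not verified on this cluster: the same two properties should also be settable once in mapred-site.xml instead of being passed as per-job -D flags,
>
> yarn.app.mapreduce.am.env: HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}
> mapreduce.admin.user.env: HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}
>
> so that the plain ./bin/yarn jar ... pi 10 10 invocation would work again without extra flags.)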
> 2. For the record: the logging and htrace classes are excluded from the shaded hadoop client jar, so I added them manually, one by one, to Spark (the Spark 2.1.0 distribution without Hadoop):
>
> RUN wget `cat url` -O spark.tar.gz && tar zxf spark.tar.gz && rm spark.tar.gz && mv spark* spark
> RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar /opt/spark/jars
> RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0-alpha2.jar /opt/spark/jars
> ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar /opt/spark/jars
> ADD https://repo1.maven.org/maven2/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar /opt/spark/jars
> ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar /opt/spark/jars/
> ADD https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar /opt/spark/jars
>
> With these jar files, Spark 2.1.0 works well with the alpha2 versions of HDFS and YARN.
>
> 3. The message "Upgrade in progress. Not yet finalized." didn't disappear from the NameNode web UI, but the cluster works well.
>
> Most probably I missed something, but it's a little bit confusing.
>
> (I checked the REST call; it is the JMX bean that reports the upgrade as not yet finalized. The code of the web page seems to be OK.)
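> (Presumably the missing step: the NameNode keeps reporting the upgrade as not finalized until it is explicitly finalized, along the lines of
>
> hdfs dfsadmin -finalizeUpgrade
>
> Finalizing removes the pre-upgrade rollback data on the NameNode and DataNodes, so it should only be run once the new version looks good.)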
> Regards,
> Marton
>
> On Jan 25, 2017, at 8:38 AM, Yongjun Zhang <yjzhan...@apache.org> wrote:
>
> Thanks Andrew much for the work here!
>
> +1 (binding).
>
> - Downloaded both binary and src tarballs
> - Verified md5 checksum and signature for both
> - Built from source tarball
> - Deployed 2 pseudo clusters, one with the released tarball and the other with what I built from source, and did the following on both:
>   - Ran basic HDFS operations, snapshots and distcp jobs
>   - Ran the pi job
>   - Examined the HDFS web UI and YARN web UI.
>
> Best,
>
> --Yongjun
>
> On Tue, Jan 24, 2017 at 3:56 PM, Eric Badger <ebad...@yahoo-inc.com.invalid> wrote:
>
> +1 (non-binding)
> - Verified signatures and md5
> - Built from source
> - Started a single-node cluster on my Mac
> - Ran some sleep jobs
>
> Eric
>
> On Tuesday, January 24, 2017 4:32 PM, Yufei Gu <flyrain...@gmail.com> wrote:
>
> Hi Andrew,
>
> Thanks for working on this.
>
> +1 (Non-Binding)
>
> 1. Downloaded the binary and verified the md5.
> 2. Deployed it on a 3-node cluster with 1 ResourceManager and 2 NodeManagers.
> 3. Set YARN to use the Fair Scheduler.
> 4. Ran the pi MapReduce job.
> 5. Verified that the Hadoop version command output is correct.
>
> Best,
>
> Yufei
>
> On Tue, Jan 24, 2017 at 3:02 AM, Marton Elek <me...@hortonworks.com> wrote:
>
> > minicluster is kind of weird on filesystems that don't support mixed case, like OS X's default HFS+.
>
> $ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
> LICENSE.txt
> license/
> license/LICENSE
> license/LICENSE.dom-documentation.txt
> license/LICENSE.dom-software.txt
> license/LICENSE.sax.txt
> license/NOTICE
> license/README.dom.txt
> license/README.sax.txt
> LICENSE
> Grizzly_THIRDPARTYLICENSEREADME.txt
>
> I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to add the missing META-INF/LICENSE.txt to the shaded files.
>
> Question: what should be done with the other LICENSE files in the minicluster? Can we just exclude them (from a legal point of view)?
>
> Regards,
> Marton