+1 (non-binding)
* Built from source
* Deployed on a pseudo-distributed cluster (Mac)
* Ran wordcount and sleep jobs
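(For reference, the runs were along these lines; the exact jar names and paths below are assumptions based on the standard layout of the 3.0.0-alpha2 binary tarball, and /wc-in and /wc-out are placeholder HDFS paths:)

  # wordcount from the bundled examples jar
  bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar wordcount /wc-in /wc-out

  # sleep job from the jobclient tests jar: 2 maps, 2 reduces, 1s map/reduce time
  bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-tests.jar sleep -m 2 -r 2 -mt 1000 -rt 1000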
On Wednesday, January 25, 2017 3:21 AM, Marton Elek <me...@hortonworks.com> wrote:

Hi,

I also did a quick smoke test with the provided 3.0.0-alpha2 binaries. TL;DR: it works well.

Environment:
* 5 hosts, Docker-based Hadoop cluster, every component in a separate container (5 datanodes / 5 nodemanagers / ...)
* Components:
  * HDFS/YARN cluster (upgraded from 2.7.3 to 3.0.0-alpha2 using the binary package under vote)
  * Zeppelin 0.6.2 / 0.7.0-RC2
  * Spark 2.0.2 / 2.1.0
  * HBase 1.2.4 + ZooKeeper
  * additional Docker containers for configuration management and monitoring
* No HA, no Kerberos, no wire encryption

Results:
* HDFS cluster upgraded successfully from 2.7.3 (with about 200G of data)
* Imported 100G of data into HBase successfully
* Started Spark jobs to process 1G of JSON from HDFS (using a Spark master/slave cluster). It worked even when I used Zeppelin 0.6.2 + Spark 2.0.2 (with the old Hadoop client included). Obviously the old version can't use the new YARN cluster, as the token file format has changed.
* Upgraded my setup to Zeppelin 0.7.0-RC2 / Spark 2.1.0 (distribution without Hadoop) / Hadoop 3.0.0-alpha2. It also worked well: processed the same JSON files from HDFS with Spark jobs (from Zeppelin) over the YARN cluster (master: yarn, deploy-mode: cluster).
* Started Spark jobs (with spark-submit, master: yarn) to count records from the HBase database: OK
* Started the example MapReduce jobs from the distribution over YARN. They were OK, but only with specific configuration (see below).

So my overall impression is that it works very well (at least with my 'smalldata').

Some notes (none of them are blocking):

1. To run the example MapReduce jobs I had to define HADOOP_MAPRED_HOME on the command line:

  ./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar pi \
    -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" \
    -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" \
    10 10

And in yarn-site:

  yarn.nodemanager.env-whitelist: JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR

I don't know the exact reason for the change, but 2.7.3 was more user-friendly, as the examples could be run without any specific configuration. For the same reason I didn't start an HBase MapReduce job with the hbase command-line app (there may be an option for hbase to define MAPRED_HOME_DIR as well, but by default I got a ClassNotFoundException for one of the MR classes).

2. For the record: the logging and htrace classes are excluded from the shaded Hadoop client jar, so I added them manually, one by one, to Spark (the Spark 2.1.0 distribution without Hadoop):

  RUN wget `cat url` -O spark.tar.gz && tar zxf spark.tar.gz && rm spark.tar.gz && mv spark* spark
  RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar /opt/spark/jars
  RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0-alpha2.jar /opt/spark/jars
  ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar /opt/spark/jars
  ADD https://repo1.maven.org/maven2/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar /opt/spark/jars
  ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar /opt/spark/jars/
  ADD https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar /opt/spark/jars

With these jar files, Spark 2.1.0 works well with the alpha2 versions of HDFS and YARN.
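(A quick way to sanity-check the jar set above is a Spark shell session against YARN along these lines; the JSON path below is only a placeholder, any of the test files on HDFS would do:)

  /opt/spark/bin/spark-shell --master yarn --deploy-mode client
  scala> val df = spark.read.json("hdfs:///data/sample.json")   // placeholder path
  scala> df.count()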
3. The message "Upgrade in progress. Not yet finalized." did not disappear from the NameNode web UI, although the cluster works well. Most probably I missed a step, but it is a little bit confusing. (I checked the REST call: it is the JMX bean that reports the upgrade as not yet finalized, so the code of the web page itself seems to be OK.)

Regards,
Marton

On Jan 25, 2017, at 8:38 AM, Yongjun Zhang <yjzhan...@apache.org> wrote:

Thanks much, Andrew, for the work here!

+1 (binding)
- Downloaded both binary and src tarballs
- Verified md5 checksum and signature for both
- Built from the source tarball
- Deployed 2 pseudo clusters, one with the released tarball and the other with what I built from source, and did the following on both:
  - Ran basic HDFS operations, snapshots and distcp jobs
  - Ran a pi job
  - Examined the HDFS and YARN web UIs

Best,
--Yongjun

On Tue, Jan 24, 2017 at 3:56 PM, Eric Badger <ebad...@yahoo-inc.com.invalid> wrote:

+1 (non-binding)
- Verified signatures and md5
- Built from source
- Started a single-node cluster on my Mac
- Ran some sleep jobs

Eric

On Tuesday, January 24, 2017 4:32 PM, Yufei Gu <flyrain...@gmail.com> wrote:

Hi Andrew,

Thanks for working on this. +1 (non-binding)

1. Downloaded the binary and verified the md5.
2. Deployed it on a 3-node cluster with 1 ResourceManager and 2 NodeManagers.
3. Set YARN to use the Fair Scheduler.
4. Ran the MapReduce pi example job.
5. Verified that the Hadoop version command output is correct.

Best,
Yufei

On Tue, Jan 24, 2017 at 3:02 AM, Marton Elek <me...@hortonworks.com> wrote:

> minicluster is kind of weird on filesystems that don't support mixed case, like OS X's default HFS+.

$ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
LICENSE.txt
license/
license/LICENSE
license/LICENSE.dom-documentation.txt
license/LICENSE.dom-software.txt
license/LICENSE.sax.txt
license/NOTICE
license/README.dom.txt
license/README.sax.txt
LICENSE
Grizzly_THIRDPARTYLICENSEREADME.txt

I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to add the missing META-INF/LICENSE.txt to the shaded files.

Question: what should be done with the other LICENSE files in the minicluster? Can we just exclude them (from a legal point of view)?

Regards,
Marton
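PS: purely on the mechanical side (the legal question above stays open): if excluding is acceptable, a maven-shade-plugin filter along these lines could drop the stray per-dependency license entries. This is only a sketch of the plugin's filter mechanism, not the actual hadoop-client-minicluster pom:

  <!-- sketch: goes under the shade plugin's <configuration> element -->
  <filters>
    <filter>
      <artifact>*:*</artifact>
      <excludes>
        <exclude>license/**</exclude>
        <exclude>LICENSE</exclude>
        <exclude>LICENSE.txt</exclude>
        <exclude>*THIRDPARTYLICENSEREADME.txt</exclude>
      </excludes>
    </filter>
  </filters>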