Ankur - can you confirm that you got the stock JavaKinesisWordCountASL example working on EMR per Chris's suggestion?
I want to stay ahead of any issues that you may encounter with the Kinesis + Spark Streaming + EMR integration, as this is a popular stack. Thanks! -Chris

On Fri, Mar 27, 2015 at 7:56 AM, Bozeman, Christopher <bozem...@amazon.com> wrote:

> Ankur,
>
> The JavaKinesisWordCountASLYARN example is no longer valid; it was added to
> the EMR build back in 1.1.0 only to demonstrate Spark Streaming with Kinesis
> on YARN. Just follow the stock example, JavaKinesisWordCountASL. It is
> better form anyway, since it is best not to hard-code the master setting.
>
> Thanks,
> Christopher
>
> *From:* Ankur Jain [mailto:ankur.j...@yash.com]
> *Sent:* Wednesday, March 25, 2015 10:24 PM
> *To:* Arush Kharbanda
> *Cc:* user@spark.apache.org
> *Subject:* RE: JavaKinesisWordCountASLYARN Example not working on EMR
>
> I installed Spark via a bootstrap action in EMR:
>
> https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark
>
> However, when I run Spark without YARN (local mode), it works fine.
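Arush's question below about the *-Pkinesis-asl* profile is worth checking first: the Kinesis examples and receiver are only compiled into the assembly when that profile is enabled (the Kinesis Client Library is under the Amazon Software License rather than Apache 2.0, so it is opt-in). A source build for this setup would look roughly like the following sketch; the Hadoop profile here is an assumption based on the `hadoop2.4.0` assembly jar named later in the log:

```shell
# Build Spark 1.3.0 with YARN support and the (ASL-licensed) Kinesis module.
# Without -Pkinesis-asl, the KinesisWordCount examples are absent from the
# examples assembly and run-example cannot find them.
mvn -Pyarn -Phadoop-2.4 -Pkinesis-asl -DskipTests clean package
```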
>
> Thanks,
> Ankur
>
> *From:* Arush Kharbanda [mailto:ar...@sigmoidanalytics.com]
> *Sent:* Wednesday, March 25, 2015 7:31 PM
> *To:* Ankur Jain
> *Cc:* user@spark.apache.org
> *Subject:* Re: JavaKinesisWordCountASLYARN Example not working on EMR
>
> Did you build for Kinesis using the profile *-Pkinesis-asl*?
>
> On Wed, Mar 25, 2015 at 7:18 PM, ankur.jain <ankur.j...@yash.com> wrote:
>
> Hi,
> I am trying to run a Spark on YARN program from the Spark examples
> directory using Amazon Kinesis on an EMR cluster.
> I am using Spark 1.3.0 and EMR AMI 3.5.0.
>
> I've set up the credentials:
> export AWS_ACCESS_KEY_ID=XXXXXX
> export AWS_SECRET_KEY=XXXXXXX
>
> *A) This is the Kinesis word count producer, which ran successfully:*
> run-example org.apache.spark.examples.streaming.KinesisWordCountProducerASL
> mySparkStream https://kinesis.us-east-1.amazonaws.com 1 5
>
> *B) This is the normal consumer using Spark Streaming, which also ran
> successfully:*
> run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASL
> mySparkStream https://kinesis.us-east-1.amazonaws.com
>
> *C) And this is the YARN-based program, which is NOT WORKING:*
> run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN
> mySparkStream https://kinesis.us-east-1.amazonaws.com
>
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> 15/03/25 11:52:45 INFO spark.SparkContext: Running Spark version 1.3.0
> 15/03/25 11:52:45 WARN spark.SparkConf: SPARK_CLASSPATH was detected (set to
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar').
> This is deprecated in Spark 1.0+.
> Please instead use:
> • ./spark-submit with --driver-class-path to augment the driver classpath
> • spark.executor.extraClassPath to augment the executor classpath
> 15/03/25 11:52:45 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar'
> as a work-around.
> 15/03/25 11:52:45 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar'
> as a work-around.
> 15/03/25 11:52:46 INFO spark.SecurityManager: Changing view acls to: hadoop
> 15/03/25 11:52:46 INFO spark.SecurityManager: Changing modify acls to: hadoop
> 15/03/25 11:52:46 INFO spark.SecurityManager: SecurityManager: authentication disabled;
> ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
> 15/03/25 11:52:47 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 15/03/25 11:52:48 INFO Remoting: Starting remoting
> 15/03/25 11:52:48 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkDriver@ip-10-80-175-92.ec2.internal:59504]
> 15/03/25 11:52:48 INFO util.Utils: Successfully started service 'sparkDriver' on port 59504.
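As an aside, the SPARK_CLASSPATH deprecation warning in the log spells out its own fix. On this install that would look roughly like the sketch below (untested; the classpath value is the one from the log, hoisted into a shell variable, and the jar path and arguments are taken from the thread):

```shell
# Pass the EMR classpath explicitly instead of relying on SPARK_CLASSPATH.
EMR_CLASSPATH='/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar'

spark-submit \
  --driver-class-path "$EMR_CLASSPATH" \
  --conf spark.executor.extraClassPath="$EMR_CLASSPATH" \
  --class org.apache.spark.examples.streaming.JavaKinesisWordCountASL \
  /home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar \
  mySparkStream https://kinesis.us-east-1.amazonaws.com
```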
> 15/03/25 11:52:48 INFO spark.SparkEnv: Registering MapOutputTracker
> 15/03/25 11:52:48 INFO spark.SparkEnv: Registering BlockManagerMaster
> 15/03/25 11:52:48 INFO storage.DiskBlockManager: Created local directory at
> /mnt/spark/spark-120befbc-6dae-4751-b41f-dbf7b3d97616/blockmgr-d339d180-36f5-465f-bda3-cecccb23b1d3
> 15/03/25 11:52:48 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
> 15/03/25 11:52:48 INFO spark.HttpFileServer: HTTP File server directory is
> /mnt/spark/spark-85e88478-3dad-4fcf-a43a-efd15166bef3/httpd-6115870a-0d90-44df-aa7c-a6bd1a47e107
> 15/03/25 11:52:48 INFO spark.HttpServer: Starting HTTP Server
> 15/03/25 11:52:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/03/25 11:52:49 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:44879
> 15/03/25 11:52:49 INFO util.Utils: Successfully started service 'HTTP file server' on port 44879.
> 15/03/25 11:52:49 INFO spark.SparkEnv: Registering OutputCommitCoordinator
> 15/03/25 11:52:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/03/25 11:52:49 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
> 15/03/25 11:52:49 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
> 15/03/25 11:52:49 INFO ui.SparkUI: Started SparkUI at http://ip-10-80-175-92.ec2.internal:4040
> 15/03/25 11:52:50 INFO spark.SparkContext: Added JAR
> file:/home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar at
> http://10.80.175.92:44879/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427284370358
> 15/03/25 11:52:50 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
> 15/03/25 11:52:51 ERROR cluster.YarnClusterSchedulerBackend: Application ID is not set.
> 15/03/25 11:52:51 INFO netty.NettyBlockTransferService: Server created on 49982
> 15/03/25 11:52:51 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/03/25 11:52:51 INFO storage.BlockManagerMasterActor: Registering block manager
> ip-10-80-175-92.ec2.internal:49982 with 265.4 MB RAM,
> BlockManagerId(<driver>, ip-10-80-175-92.ec2.internal, 49982)
> 15/03/25 11:52:51 INFO storage.BlockManagerMaster: Registered BlockManager
> *Exception in thread "main" java.lang.NullPointerException*
> *at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:581)*
> at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
> at org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:642)
> at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:75)
> at org.apache.spark.streaming.api.java.JavaStreamingContext.<init>(JavaStreamingContext.scala:132)
> at org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN.main(JavaKinesisWordCountASLYARN.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/JavaKinesisWordCountASLYARN-Example-not-working-on-EMR-tp22226.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
*Arush Kharbanda* || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
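For anyone hitting the same NullPointerException: in yarn-cluster mode the YARN ApplicationMaster must be started by spark-submit before the user code creates its SparkContext. The removed JavaKinesisWordCountASLYARN variant instead hard-coded a YARN master inside the program, so when launched via run-example the ApplicationMaster state is likely never initialized and `sparkContextInitialized` dereferences null (the earlier "Application ID is not set" error points the same way). Per Christopher's advice, a submit of the stock example on YARN would look roughly like this (untested sketch; the jar path is the one from the log):

```shell
# Run the stock Kinesis example on YARN. The master is supplied here, not
# hard-coded in the program, so spark-submit can set up the ApplicationMaster.
spark-submit --master yarn-cluster \
  --class org.apache.spark.examples.streaming.JavaKinesisWordCountASL \
  /home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar \
  mySparkStream https://kinesis.us-east-1.amazonaws.com
```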