Ankur-

Can you confirm that you got the stock JavaKinesisWordCountASL example
working on EMR per Chris's suggestion?

I want to stay ahead of any issues you may encounter with the Kinesis
+ Spark Streaming + EMR integration, as this is a popular stack.

Thanks!

-Chris

On Fri, Mar 27, 2015 at 7:56 AM, Bozeman, Christopher <bozem...@amazon.com>
wrote:

>  Ankur,
>
>
>
> The JavaKinesisWordCountASLYARN example is no longer valid; it was added
> to the EMR build back in 1.1.0 only to demonstrate Spark Streaming with
> Kinesis on YARN. Just follow the stock example, JavaKinesisWordCountASL.
> It is better form anyway, since it is best not to hard-code the master
> setting.
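>
> A minimal sketch of what that looks like in practice, with the master
> supplied at submit time instead of in code (the jar path matches the one
> in the logs further down this thread; adjust it for your install):
>
> ```shell
> # Sketch: choose the master via spark-submit rather than setMaster()
> # in the application. Stream name and endpoint are the same arguments
> # used elsewhere in this thread.
> spark-submit --master yarn-cluster \
>   --class org.apache.spark.examples.streaming.JavaKinesisWordCountASL \
>   /home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar \
>   mySparkStream https://kinesis.us-east-1.amazonaws.com
> ```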
>
>
>
> Thanks
>
> Christopher
>
>
>
>
>
> *From:* Ankur Jain [mailto:ankur.j...@yash.com]
> *Sent:* Wednesday, March 25, 2015 10:24 PM
> *To:* Arush Kharbanda
> *Cc:* user@spark.apache.org
> *Subject:* RE: JavaKinesisWordCountASLYARN Example not working on EMR
>
>
>
> I installed Spark via a bootstrap action on EMR:
>
>
>
> https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark
>
>
>
> However, when I run Spark without YARN (in local mode), it works fine.
>
>
>
> Thanks
>
> Ankur
>
>
>
> *From:* Arush Kharbanda [mailto:ar...@sigmoidanalytics.com]
> *Sent:* Wednesday, March 25, 2015 7:31 PM
> *To:* Ankur Jain
> *Cc:* user@spark.apache.org
> *Subject:* Re: JavaKinesisWordCountASLYARN Example not working on EMR
>
>
>
> Did you build for Kinesis using the *-Pkinesis-asl* profile?
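>
> For reference, the Spark docs for this release gate the Kinesis examples
> behind that Maven profile (the Amazon Software License keeps them out of
> the default build):
>
> ```shell
> # Build Spark with the Kinesis ASL examples included.
> mvn -Pkinesis-asl -DskipTests clean package
> ```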
>
>
>
> On Wed, Mar 25, 2015 at 7:18 PM, ankur.jain <ankur.j...@yash.com> wrote:
>
> Hi,
> I am trying to run a Spark-on-YARN program from Spark's examples
> directory, using Amazon Kinesis, on an EMR cluster.
> I am using Spark 1.3.0 and EMR AMI 3.5.0.
>
> I've set up the credentials:
> export AWS_ACCESS_KEY_ID=XXXXXX
> export AWS_SECRET_KEY=XXXXXXX
>
> *A) The Kinesis word count producer, which ran successfully:*
> run-example org.apache.spark.examples.streaming.KinesisWordCountProducerASL
> mySparkStream https://kinesis.us-east-1.amazonaws.com 1 5
>
> *B) The normal consumer using Spark Streaming, which also ran
> successfully:*
> run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASL
> mySparkStream https://kinesis.us-east-1.amazonaws.com
>
> *C) The YARN-based program, which is NOT WORKING:*
> run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN
> mySparkStream https://kinesis.us-east-1.amazonaws.com
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> 15/03/25 11:52:45 INFO spark.SparkContext: Running Spark version 1.3.0
> 15/03/25 11:52:45 WARN spark.SparkConf:
> SPARK_CLASSPATH was detected (set to
>
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/:/home/hadoop/spark/classpath/emrfs/:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar').
> This is deprecated in Spark 1.0+.
> Please instead use:
> •       ./spark-submit with --driver-class-path to augment the driver
> classpath
> •       spark.executor.extraClassPath to augment the executor classpath
> 15/03/25 11:52:45 WARN spark.SparkConf: Setting
> 'spark.executor.extraClassPath' to
>
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/:/home/hadoop/spark/classpath/emrfs/:/home/hadoop/share/hadoop/common/lib/:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar'
> as a work-around.
> 15/03/25 11:52:45 WARN spark.SparkConf: Setting
> 'spark.driver.extraClassPath' to
>
> '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/:/home/hadoop/spark/classpath/emrfs/:/home/hadoop/share/hadoop/common/lib/:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar'
> as a work-around.
> 15/03/25 11:52:46 INFO spark.SecurityManager: Changing view acls to: hadoop
> 15/03/25 11:52:46 INFO spark.SecurityManager: Changing modify acls to:
> hadoop
> 15/03/25 11:52:46 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(hadoop); users with modify permissions: Set(hadoop)
> 15/03/25 11:52:47 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 15/03/25 11:52:48 INFO Remoting: Starting remoting
> 15/03/25 11:52:48 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkDriver@ip-10-80-175-92.ec2.internal:59504]
> 15/03/25 11:52:48 INFO util.Utils: Successfully started service
> 'sparkDriver' on port 59504.
> 15/03/25 11:52:48 INFO spark.SparkEnv: Registering MapOutputTracker
> 15/03/25 11:52:48 INFO spark.SparkEnv: Registering BlockManagerMaster
> 15/03/25 11:52:48 INFO storage.DiskBlockManager: Created local directory at
>
> /mnt/spark/spark-120befbc-6dae-4751-b41f-dbf7b3d97616/blockmgr-d339d180-36f5-465f-bda3-cecccb23b1d3
> 15/03/25 11:52:48 INFO storage.MemoryStore: MemoryStore started with
> capacity 265.4 MB
> 15/03/25 11:52:48 INFO spark.HttpFileServer: HTTP File server directory is
>
> /mnt/spark/spark-85e88478-3dad-4fcf-a43a-efd15166bef3/httpd-6115870a-0d90-44df-aa7c-a6bd1a47e107
> 15/03/25 11:52:48 INFO spark.HttpServer: Starting HTTP Server
> 15/03/25 11:52:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/03/25 11:52:49 INFO server.AbstractConnector: Started
> SocketConnector@0.0.0.0:44879
> 15/03/25 11:52:49 INFO util.Utils: Successfully started service 'HTTP file
> server' on port 44879.
> 15/03/25 11:52:49 INFO spark.SparkEnv: Registering OutputCommitCoordinator
> 15/03/25 11:52:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/03/25 11:52:49 INFO server.AbstractConnector: Started
> SelectChannelConnector@0.0.0.0:4040
> 15/03/25 11:52:49 INFO util.Utils: Successfully started service 'SparkUI'
> on
> port 4040.
> 15/03/25 11:52:49 INFO ui.SparkUI: Started SparkUI at
> http://ip-10-80-175-92.ec2.internal:4040
> 15/03/25 11:52:50 INFO spark.SparkContext: Added JAR
> file:/home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar at
> http://10.80.175.92:44879/jars/spark-examples-1.3.0-hadoop2.4.0.jar with
> timestamp 1427284370358
> 15/03/25 11:52:50 INFO cluster.YarnClusterScheduler: Created
> YarnClusterScheduler
> 15/03/25 11:52:51 ERROR cluster.YarnClusterSchedulerBackend: Application ID
> is not set.
> 15/03/25 11:52:51 INFO netty.NettyBlockTransferService: Server created on
> 49982
> 15/03/25 11:52:51 INFO storage.BlockManagerMaster: Trying to register
> BlockManager
> 15/03/25 11:52:51 INFO storage.BlockManagerMasterActor: Registering block
> manager ip-10-80-175-92.ec2.internal:49982 with 265.4 MB RAM,
> BlockManagerId(, ip-10-80-175-92.ec2.internal, 49982)
> 15/03/25 11:52:51 INFO storage.BlockManagerMaster: Registered BlockManager
> *Exception in thread "main" java.lang.NullPointerException*
> *at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:581)*
> at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
> at org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:642)
> at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:75)
> at org.apache.spark.streaming.api.java.JavaStreamingContext.<init>(JavaStreamingContext.scala:132)
> at org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN.main(JavaKinesisWordCountASLYARN.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/JavaKinesisWordCountASLYARN-Example-not-working-on-EMR-tp22226.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
>
>
>
> --
>
>
> *Arush Kharbanda* || Technical Teamlead
>
> ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
>
>
