Yeah. If I restart the Spark interpreter and then run just

import org.apache.spark.streaming._
val ssc = new StreamingContext(sc, Seconds(1))
I see no errors, so it's bound to be one of the previous steps. Anyway, I
changed the Maven artifact to 1.6.0 too -- still the same errors. I'll post
it here when or if I solve this. Thanks anyway.

FA

On 27 January 2016 at 23:16, Jonathan Kelly <jonathaka...@gmail.com> wrote:

> I have no idea if this is the cause, but you should use
> org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.0 instead of
> org.apache.spark:spark-streaming-kinesis-asl_2.10:1.5.2, since emr-4.3.0
> includes Spark 1.6.0.
>
> On Wed, Jan 27, 2016 at 3:00 PM Felipe Almeida <falmeida1...@gmail.com>
> wrote:
>
>> Some of my last message is getting trimmed by Google, but the content is
>> there.
>>
>> FA
>>
>> On 27 January 2016 at 22:59, Felipe Almeida <falmeida1...@gmail.com>
>> wrote:
>>
>>> Alright, here is my full code:
>>>
>>> // paragraph one
>>> z.load("org.apache.spark:spark-streaming-kinesis-asl_2.10:1.5.2")
>>>
>>> import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
>>> import com.amazonaws.regions.RegionUtils
>>> import com.amazonaws.services.kinesis.AmazonKinesisClient
>>> import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
>>> import com.amazonaws.services.s3.AmazonS3Client
>>>
>>> import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}
>>> import org.apache.spark.streaming.kinesis._
>>> import org.apache.spark.storage.StorageLevel
>>>
>>> import org.json4s.DefaultFormats
>>> import org.json4s.jackson.JsonMethods.parse
>>>
>>> // paragraph 2
>>> val credentials = new DefaultAWSCredentialsProviderChain().getCredentials
>>> val endpointUrl = "kinesis.us-east-1.amazonaws.com"
>>> val appName = "StoredashCollector-debug"
>>> val streamName = XXXXX
>>> val windowDuration = Seconds(10)
>>> val rememberDuration = Minutes(7)
>>>
>>> val kinesisClient = new AmazonKinesisClient(credentials)
>>> kinesisClient.setEndpoint(endpointUrl)
>>> val numStreams =
>>> kinesisClient.describeStream(streamName).getStreamDescription.getShards.size
>>> val regionName = RegionUtils.getRegionByEndpoint(endpointUrl).getName()
>>>
>>> // problematic line here
>>> val ssc = new StreamingContext(sc, windowDuration)
>>> ssc.remember(rememberDuration)
>>>
>>> case class LogEvent(
>>>   DataType: Option[String]
>>>   , RequestType: Option[String]
>>>   , SessionId: Option[String]
>>>   , HostName: Option[String]
>>>   , TransactionAffiliation: Option[String]
>>>   , SkuStockOutFromProductDetail: List[String]
>>>   , SkuStockOutFromShelf: List[String]
>>>   , TransactionId: Option[String]
>>>   , ProductId: Option[String]
>>> )
>>>
>>> val streams = (0 until 1).map { i =>
>>>   KinesisUtils.createStream(ssc, appName, streamName, endpointUrl, regionName,
>>>     InitialPositionInStream.LATEST, windowDuration, StorageLevel.MEMORY_ONLY)
>>> }
>>>
>>> val unionStream = ssc.union(streams).map(byteArray => new String(byteArray))
>>>
>>> // used by json4s
>>> implicit val formats = DefaultFormats
>>> val eventsStream = unionStream.map {
>>>   str => parse(str).extract[LogEvent]
>>> }
>>>
>>> eventsStream.print(10)
>>>
>>> ssc.start()
>>>
>>> // output of the previous paragraph, including error message:
>>>
>>> credentials: com.amazonaws.auth.AWSCredentials = com.amazonaws.auth.BasicSessionCredentials@1181e472
>>> endpointUrl: String = kinesis.us-east-1.amazonaws.com
>>> appName: String = StoredashCollector-debug
>>> streamName: String = XXXX
>>> windowDuration: org.apache.spark.streaming.Duration = 10000 ms
>>> rememberDuration: org.apache.spark.streaming.Duration = 420000 ms
>>> kinesisClient: com.amazonaws.services.kinesis.AmazonKinesisClient = com.amazonaws.services.kinesis.AmazonKinesisClient@275e9422
>>> numStreams: Int = 2
>>> regionName: String = us-east-1
>>> <console>:45: error: overloaded method constructor StreamingContext with alternatives:
>>>   (path: String,sparkContext:
org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext
>>>   <and> (path: String,hadoopConf: org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext
>>>   <and> (conf: org.apache.spark.SparkConf,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
>>>   <and> (sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
>>>  cannot be applied to (org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext, org.apache.spark.streaming.Duration)
>>>        val ssc = new StreamingContext(sc,windowDuration)
>>>
>>> That's it; I've sent you my cluster id by e-mail.
>>>
>>> FA
>>>
>>> On 27 January 2016 at 21:51, Jonathan Kelly <jonathaka...@gmail.com>
>>> wrote:
>>>
>>>> That's a really weird issue! Sorry, though, I can't seem to reproduce
>>>> it, neither in spark-shell nor in Zeppelin (on emr-4.3.0). This is all
>>>> I'm running:
>>>>
>>>> import org.apache.spark.streaming._
>>>> val ssc = new StreamingContext(sc, Seconds(1))
>>>>
>>>> Is there anything else you're running? Are you specifying any special
>>>> configuration when creating the cluster? Do you have a cluster id I can
>>>> take a look at? Also, can you reproduce this in spark-shell or just in
>>>> Zeppelin?
>>>>
>>>> ~ Jonathan
>>>>
>>>> On Wed, Jan 27, 2016 at 12:46 PM Felipe Almeida <falmeida1...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Jonathan.
>>>>>
>>>>> That issue (using z.load("mavenproject")) is solved indeed (thanks),
>>>>> but now I can't create a StreamingContext from the current
>>>>> SparkContext (variable sc).
>>>>>
>>>>> When trying to run val ssc = new StreamingContext(sc,Seconds(1)), I
>>>>> get the following error:
>>>>>
>>>>> <console>:45: error: overloaded method constructor StreamingContext with alternatives:
>>>>>   (path: String,sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext
>>>>>   <and> (path: String,hadoopConf: org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext
>>>>>   <and> (conf: org.apache.spark.SparkConf,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
>>>>>   <and> (sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
>>>>>  cannot be applied to (org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext, org.apache.spark.streaming.Duration)
>>>>>        val ssc = new StreamingContext(sc,Seconds(1))
>>>>>
>>>>> See how the packages leading up to the class name look funny
>>>>> (org.apache.spark.org.apache...); it looks like they're in a loop. And
>>>>> one can see in the error message that what I did was indeed pass a
>>>>> SparkContext and then a Duration, as per the docs.
>>>>>
>>>>> FA
>>>>>
>>>>> On 27 January 2016 at 02:14, Jonathan Kelly <jonathaka...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yeah, I saw your self-reply on SO but did not have time to reply
>>>>>> there also before leaving for the day. I hope emr-4.3.0 works well
>>>>>> for you! Please let me know if you run into any other issues,
>>>>>> especially with Spark or Zeppelin, or if you have any feature
>>>>>> requests.
>>>>>>
>>>>>> Thanks,
>>>>>> Jonathan
>>>>>>
>>>>>> On Tue, Jan 26, 2016 at 6:02 PM Felipe Almeida <falmeida1...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you very much, Jonathan; I'll be sure to try out the new
>>>>>>> label as soon as I can!
>>>>>>>
>>>>>>> I eventually found a workaround, which involved passing the package
>>>>>>> via --packages to spark-submit, configured under
>>>>>>> /usr/lib/zeppelin/conf/zeppelin.sh:
>>>>>>> http://stackoverflow.com/a/35006137/436721
>>>>>>>
>>>>>>> Anyway, thanks for the reply; I'll let you know if the problem
>>>>>>> persists.
>>>>>>>
>>>>>>> FA
>>>>>>>
>>>>>>> On 26 January 2016 at 23:19, Jonathan Kelly <jonathaka...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Felipe,
>>>>>>>>
>>>>>>>> I'm really sorry you had to deal with this frustration, but the
>>>>>>>> good news is that we on EMR have fixed this issue and just
>>>>>>>> released the fix in emr-4.3.0 today. (At least, it is available in
>>>>>>>> some regions today and should be available in the other regions by
>>>>>>>> some time tomorrow.)
>>>>>>>>
>>>>>>>> BTW, this was discussed on another thread last month:
>>>>>>>> http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/Zeppelin-notes-version-control-scheduler-and-external-deps-td1730.html
>>>>>>>> (Not sure why I show up with the name "Work". Oh well. =P)
>>>>>>>>
>>>>>>>> ~ Jonathan
>>>>>>>>
>>>>>>>> On Mon, Jan 25, 2016 at 5:21 PM Felipe Almeida <falmeida1...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I cannot, for the love of ***, make z.load("maven-project") work
>>>>>>>>> for Zeppelin 0.5.5 on Amazon Elastic MapReduce. I keep getting a
>>>>>>>>> NullPointerException due to, I think, something related to
>>>>>>>>> Sonatype.
>>>>>>>>>
>>>>>>>>> I've also asked a question on SO about this:
>>>>>>>>> http://stackoverflow.com/questions/35005455/java-npe-when-loading-a-dependency-from-maven-from-within-zeppelin-on-aws-emr
>>>>>>>>>
>>>>>>>>> Has anyone gone through this?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> FA

--
“Every time you stay out late; every time you sleep in; every time you miss
a workout; every time you don’t give 100% – You make it that much easier for
me to beat you.” - Unknown author
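For reference, the fix Jonathan suggests above can be sketched as a pair of Zeppelin paragraphs. This is only a sketch tied to the thread's environment (Zeppelin on emr-4.3.0, where Spark is 1.6.0); the batch interval is illustrative, and whether the version mismatch fully explains the repeated package names is the thread's working hypothesis, not a confirmed diagnosis:

```scala
// Zeppelin paragraph 1: load the Kinesis ASL artifact whose version matches
// the cluster's Spark (1.6.0 on emr-4.3.0). Loading the 1.5.2 artifact on a
// 1.6.0 cluster puts a second copy of the Spark classes on the interpreter's
// classpath, which may be what produces the strange
// "org.apache.spark.org.apache.spark..." type names in the error above.
z.load("org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.0")

// Zeppelin paragraph 2, run after restarting the Spark interpreter so the
// dependency is loaded before sc is created:
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))
```

The key ordering constraint, per the Zeppelin dependency-loading docs, is that z.load must run (and the interpreter be restarted) before the SparkContext is first used in the note; otherwise the dependency is not visible to the running interpreter.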