Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-15 Thread Tathagata Das
Yes, what Nick said is the recommended way. In most usecases, a spark streaming program in production is not usually run from the shell. Hence, we chose not to make the external stuff (twitter, kafka, etc.) available to spark shell to avoid dependency conflicts brought it by them with spark's depen

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-15 Thread Nick Pentreath
You could try the following: create a minimal project using sbt or Maven, add spark-streaming-twitter as a dependency, run sbt assembly (or mvn package) on that to create a fat jar (with Spark as provided dependency), and add that to the shell classpath when starting up. On Tue, Jul 15, 2014 at 9

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-15 Thread Praveen Seluka
If you want to make Twitter* classes available in your shell, I believe you could do the following 1. Change the parent pom module ordering - Move external/twitter before assembly 2. In assembly/pom.xm, add external/twitter dependency - this will package twitter* into the assembly jar Now when spa

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Nicholas Chammas
Hmm, I'd like to clarify something from your comments, Tathagata. Going forward, is Twitter Streaming functionality not supported from the shell? What should users do if they'd like to process live Tweets from the shell? Nick On Mon, Jul 14, 2014 at 11:50 PM, Nicholas Chammas < nicholas.cham...

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Nicholas Chammas
> > At some point, you were able to access TwitterUtils from spark shell using > Spark 1.0.0+ ? Yep. > If yes, then what change in Spark caused it to not work any more? It still works for me. I was just commenting on your remark that it doesn't work through the shell, which I now understand t

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Tathagata Das
Oh right, that could have happened only after Spark 1.0.0. So let me clarify. At some point, you were able to access TwitterUtils from spark shell using Spark 1.0.0+ ? If yes, then what change in Spark caused it to not work any more? TD On Mon, Jul 14, 2014 at 7:52 PM, Nicholas Chammas < nichol

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Nicholas Chammas
If we're talking about the issue you captured in SPARK-2464 , then it was a newly launched EC2 cluster on 1.0.1. On Mon, Jul 14, 2014 at 10:48 PM, Tathagata Das wrote: > Did you make any updates in Spark version recently, after which you > notic

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Tathagata Das
Did you make any updates in Spark version recently, after which you noticed this problem? Because if you were using Spark 0.8 and below, then twitter would have worked in the Spark shell. In Spark 0.9, we moved those dependencies out of the core spark for those to update more freely without raising

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Nicholas Chammas
On Mon, Jul 14, 2014 at 6:52 PM, Tathagata Das wrote: > The twitter functionality is not available through the shell. > I've been processing Tweets live from the shell, though not for a long time. That's how I uncovered the problem with the Twitter receiver not deregistering, btw. Did I misunde

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Tathagata Das
I guess this is not clearly documented. At a high level, any class that is in the package org.apache.spark.streaming.XXX where XXX is in { twitter, kafka, flume, zeromq, mqtt } is not available in the Spark shell. I have added this to the larger JIRA of things-to-add-to-streaming-docs https://

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread durin
Thanks. Can I see that a Class is not available in the shell somewhere in the API Docs or do I have to find out by trial and error? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/import-org-apache-spark-streaming-twitter-in-Shell-tp9665p9678.html Sent from

Re: import org.apache.spark.streaming.twitter._ in Shell

2014-07-14 Thread Tathagata Das
The twitter functionality is not available through the shell. 1) we separated these non-core functionality into separate subprojects so that their dependencies do not collide/pollute those of of core spark 2) a shell is not really the best way to start a long running stream. Its best to use twitte