Hi,
I've implemented Twitter streaming using the code at the bottom of this
email. It finds some tweets based on the hashtags I'm following. However,
it seems that a large number of tweets are missing. I posted some tweets
containing hashtags that the application follows, and none of them were
received by the application. I also checked some hashtags (e.g. #android)
on Twitter using the Live view, and I could see a new post with that
hashtag almost every second, whereas my application received only 3-4
posts per minute.
I didn't have this problem in an earlier non-Spark version of the
application, which used twitter4j to access the user stream API. I guess
this is some kind of trending stream, but I couldn't find anything that
explains which Twitter API Spark Twitter Streaming uses, or how to create
a stream that receives everything posted on Twitter.
I hope somebody can explain what the problem is and how to solve it.
Thanks,
Zoran
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.twitter.TwitterUtils
import twitter4j.Status

def initializeStreaming(): Unit = {
  val config = getTwitterConfigurationBuilder.build()
  val auth: Option[twitter4j.auth.Authorization] =
    Some(new twitter4j.auth.OAuthAuthorization(config))
  // No filter terms are passed to createStream here.
  val stream: DStream[Status] = TwitterUtils.createStream(ssc, auth)
  // Keep only statuses whose text contains one of the followed hashtags
  // (the entries of hashTagsList are assumed to be lowercase).
  val filteredStatuses = stream.filter { status =>
    val text = status.getText.toLowerCase
    hashTagsList.exists(tag => text.contains(tag))
  }
  // Collect each batch to the driver and print the matching statuses.
  filteredStatuses.foreachRDD { rdd =>
    rdd.collect().foreach(println)
  }
  ssc.start()
}
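
To rule out the hashtag check itself as the cause of the missing tweets,
the matching logic from the filter above can be exercised without a Spark
context. This is only a minimal sketch: HashtagMatch and matchesAnyTag are
hypothetical names that are not part of the application, and unlike the
original filter it also lowercases the tags, on the assumption that the
tag list may contain mixed-case entries.

```scala
// Hypothetical helper, not part of the original application: returns true
// when the text contains at least one of the given tags, ignoring case.
object HashtagMatch {
  def matchesAnyTag(text: String, tags: Seq[String]): Boolean = {
    val lowered = text.toLowerCase
    // Plain substring match, as in the original filter.
    tags.exists(tag => lowered.contains(tag.toLowerCase))
  }

  def main(args: Array[String]): Unit = {
    val tags = Seq("#android", "#spark")
    println(matchesAnyTag("Loving my new #Android phone", tags)) // true
    println(matchesAnyTag("Nothing relevant here", tags))        // false
  }
}
```

Because this is a substring match, a tweet tagged #androiddev would also
count as a match for #android, so the check can only let too many tweets
through, not too few.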