That's my goal, brother. But let's agree that Spark is not a very straightforward repo to get started with.
I have got some initial code though.

On Wednesday, 23 December 2015, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Why not create a custom dstream
> <http://spark.apache.org/docs/latest/streaming-custom-receivers.html> and
> generate the data from there itself instead of spark connecting to a socket
> server which will be fed by another twitter client?
>
> Thanks
> Best Regards
>
> On Sat, Dec 19, 2015 at 5:47 PM, Amir Rahnama <amirrahn...@gmail.com> wrote:
>
>> Hi guys,
>>
>> Thought someone would need this:
>>
>> https://github.com/ambodi/realtime-spark-twitter-stream-mining
>>
>> you can use this approach to feed twitter stream to your spark job. So
>> far, PySpark does not have a twitter dstream source.
>>
>> --
>> Thanks and Regards,
>>
>> Amir Hossein Rahnama
>>
>> *Tel: +46 (0) 761 681 102*
>> Website: www.ambodi.com
>> Twitter: @_ambodi <https://twitter.com/_ambodi>

--
Thanks and Regards,

Amir Hossein Rahnama

*Tel: +46 (0) 761 681 102*
Website: www.ambodi.com
Twitter: @_ambodi <https://twitter.com/_ambodi>