Here's a toy project: analyzing the Twitter stream.

1) Create a developer account on Twitter.
2) Using your developer credentials, connect to the Twitter streaming API to retrieve a stream of tweets.
3) Store the tweets in Kafka (using a Kafka producer).
4) Retrieve the tweets (using a Kafka consumer).
5) For each tweet (or group of tweets), compute some analysis, either in custom Java or with Storm/Samza/Spark, e.g. country of origin of the tweet, sentiment analysis, etc.
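
The analysis in step 5 could look roughly like this in plain Java. This is only a minimal sketch: it assumes the tweets have already been read from the Kafka consumer and parsed into a simple (country, text) pair. The Tweet class, the tiny word lists and the method names are all made up for illustration; real tweets are JSON with many more fields, and a real sentiment analysis would use a proper library rather than word counting.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TweetAnalysis {

    /** Hypothetical, simplified tweet; not the real Twitter payload. */
    static final class Tweet {
        final String country;
        final String text;

        Tweet(String country, String text) {
            this.country = country;
            this.text = text;
        }
    }

    // Tiny illustrative word lists, not a real sentiment lexicon.
    private static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("good", "great", "love", "happy"));
    private static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "awful", "hate", "sad"));

    /** Counts tweets per country of origin, sorted by country code. */
    static Map<String, Long> countByCountry(List<Tweet> tweets) {
        Map<String, Long> counts = new TreeMap<>();
        for (Tweet t : tweets) {
            counts.merge(t.country, 1L, Long::sum);
        }
        return counts;
    }

    /** Naive sentiment score: number of positive words minus negative words. */
    static int sentimentScore(String text) {
        int score = 0;
        for (String word : text.toLowerCase().split("\\W+")) {
            if (POSITIVE.contains(word)) score++;
            if (NEGATIVE.contains(word)) score--;
        }
        return score;
    }

    public static void main(String[] args) {
        // Stand-in for a batch of tweets returned by one consumer poll.
        List<Tweet> batch = Arrays.asList(
                new Tweet("US", "I love Kafka, it is great"),
                new Tweet("IN", "Debugging offsets makes me sad"),
                new Tweet("US", "Just another tweet"));

        System.out.println(countByCountry(batch));
        for (Tweet t : batch) {
            System.out.println(t.country + " " + sentimentScore(t.text));
        }
    }
}
```

In the real project, the batch would come from the consumer in step 4 instead of a hard-coded list, and the results could be written back to another Kafka topic.
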

It's very simple to do this and should not take you more than 1-2 days to implement.

Thanks
Manasvi

On Sun, Sep 13, 2015 at 1:11 PM, Li Tao <ahumbleco...@gmail.com> wrote:

> Hi Roger,
>
> Thanks for your recommendation. I just got to know Samza. and checked its
> code base. It is a little too huge for me.
>
> Maybe for now, I need to start a small project/application which utilize
> kafka as its infrastructure, so that I can use Kafka's API a lot and know
> Kafka better.
>
> It's hard for me to initiate such project(small, useful/meaningful, kafka
> based). Anyone has better idea?
>
> On Sun, Sep 13, 2015 at 2:21 PM, Roger Hoover <roger.hoo...@gmail.com>
> wrote:
>
> > Hi Li,
> >
> > You might take a look at Apache Samza. It's conceptually simple but
> > powerful and makes heavy use of Kafka.
> >
> > Best,
> >
> > Roger
> >
> > Sent from my iPhone
> >
> > > On Sep 12, 2015, at 10:34 PM, Li Tao <ahumbleco...@gmail.com> wrote:
> > >
> > > Hi Hackers,
> > >
> > > This is Lee, a learner of kafka, i have read the original paper on
> > > kafka, and walked through the document.
> > >
> > > I think the best way to learn sth is to write and read code about it.
> > > I am wondering is there any open source code / system which is based
> > > on kafka so that i can read or contribute to? Not too complex, not too
> > > simple.
> > >
> > > Thanks a lot!