Thank you, everyone. I'll take a look at the book suggested by Dave and Alexander. I was looking for something other than logs, but you're right, that could be a good start :)
Thanks again!

On Wed, Mar 7, 2012 at 6:52 PM, Gauthier, Alexander <alex.gauth...@teradata.com> wrote:

> Sounds like you're looking for a "problem" to solve. You mentioned being a
> "web developer", so how about loading some web logs and trying to do some
> sessionization analysis? There are plenty of map-reduce functions out
> there doing just that (with minor modifications to conform to your log
> format). That would be a good place to start thinking in terms of "big
> data" :)
>
> HTH.
>
> -----Original Message-----
> From: Fernando Doglio [mailto:fernando.dog...@moove-it.com]
> Sent: Tuesday, March 06, 2012 5:21 AM
> To: common-dev@hadoop.apache.org
> Subject: Looking for a place to start
>
> Hello everyone, this is my first mail to the list.
>
> My question has probably been answered before, but I couldn't find a way
> to search through the archives, so here it goes:
>
> I've been toying around with Hadoop for a few weeks now. I've installed
> Cloudera's VM, tried some of the examples, and written the classic word count
> example (seems like it's the "hello world" of Hadoop :P) using streaming,
> and now I'm looking for a bigger challenge.
>
> The main purpose of these tests is to train myself to think in "big data"
> terms, instead of the classic approach a web developer takes when dealing
> with information.
>
> So, taking all this into account, what would you recommend I try next?
> I've been looking for a big source of data to work with, something to get
> information out of. I know I could generate it myself, but I was hoping
> that something like that would already exist somewhere.
>
> What were your next steps when starting out with this tech?
>
> Thanks in advance!
>
> Fernando
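
For reference, the sessionization idea Alexander mentions could look roughly like the sketch below. It is only an illustration under several assumptions that are not from this thread: the log lines are tab-separated as user, epoch seconds, URL; a session ends after 30 minutes of inactivity; and the function names (map_line, reduce_user, sessionize) are made up for the example. It runs the map, sort and reduce steps locally in plain Python instead of on Hadoop, so it shows the shape of the job rather than a real streaming setup.

```python
from itertools import groupby

# Assumed inactivity timeout; 30 minutes is a common convention, not a rule.
SESSION_GAP_SECONDS = 30 * 60


def map_line(line):
    """Mapper: turn 'user<TAB>epoch_seconds<TAB>url' into (user, timestamp)."""
    user, ts, _url = line.rstrip("\n").split("\t")
    return user, int(ts)


def reduce_user(timestamps, gap=SESSION_GAP_SECONDS):
    """Reducer: split one user's hits into sessions wherever the gap is exceeded."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # still within the current session
        else:
            sessions.append([ts])    # first hit, or gap too long: new session
    return sessions


def sessionize(lines, gap=SESSION_GAP_SECONDS):
    """Run map, then a local sort standing in for the shuffle, then reduce."""
    pairs = sorted(map_line(line) for line in lines)
    return {
        user: reduce_user([ts for _, ts in group], gap)
        for user, group in groupby(pairs, key=lambda pair: pair[0])
    }


if __name__ == "__main__":
    sample = [
        "alice\t0\t/home",
        "alice\t600\t/cart",
        "alice\t4000\t/home",
        "bob\t100\t/about",
    ]
    for user, sessions in sessionize(sample).items():
        print(user, len(sessions), sessions)
```

With streaming, the mapper and reducer would each be a separate script reading stdin and writing tab-separated lines to stdout, and Hadoop would do the sorting by key between them.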