Re: Sailfish

2012-05-09 Thread Otis Gospodnetic
Hi Sriram,

>> The I-file concept could possibly be implemented here in a fairly self
>> contained way. One could even colocate/embed a KFS filesystem with such
>> an alternate shuffle, like how MR task temporary space is usually
>> colocated with HDFS storage.
> Exactly.
>> Does this seem r

Re: Web Crawler in hadoop - Unresponsive after a while

2011-10-13 Thread Otis Gospodnetic
Aishwarya, you should probably ask on the -user list. Moreover, you should probably just look at and use Nutch, which uses MR under the hood for fetching and other tasks - see http://nutch.apache.org/

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem se

Re: MapReduce Usage in Search Engines

2010-07-30 Thread Otis Gospodnetic
MapReduce tends to be used for massive (re)indexing. See http://search-lucene.com/?q=hadoop+mapreduce&fc_project=Solr&fc_project=Lucene for how Lucene/Solr people are using MapReduce. For example, in a recent project we used MapReduce (streaming with jruby, actually) together with Solr (Embed