Hi Sriram,
>> The I-file concept could possibly be implemented here in a fairly
>> self-contained way. One could even colocate/embed a KFS filesystem
>> with such an alternate shuffle, like how MR task temporary space is
>> usually colocated with HDFS storage.
> Exactly.
>> Does this seem r
Aishwarya, you should probably ask on the -user list.
Also, you should probably just look at and use Nutch, which uses MapReduce
under the hood for fetching and other tasks - see http://nutch.apache.org/
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem se
MapReduce tends to be used for massive (re)indexing.
See
http://search-lucene.com/?q=hadoop+mapreduce&fc_project=Solr&fc_project=Lucene
for how Lucene/Solr people are using MapReduce.
For example, in a recent project we used MapReduce (streaming with JRuby,
actually) together with Solr (Embed