There are actually several distributed indexing or searching projects in Lucene (the top-level ASF Lucene project, not Lucene Java), and it's time to start thinking about the possibility of bringing them together, finding commonalities, etc.
Here is the summary: - Lucene - distributed search via ParallelMultiSearcher. How you split indices/shards is up to you. - Solr - distributed indexing via SOLR-303 (see DistributedSearch on its Wiki). How you split indices/shards is up to you. - Nutch - see its org.apache.nutch.ipc (I think). How you split indices/segments is up to you. - Nutch - see the bottom of http://wiki.apache.org/nutch/Nutch2Architecture There is also Hadoop: - Using MapReduce + HDFS to build a single Lucene index in a distributed fashion (see contrib/ in Hadoop) There is also GridLucene project somewhere on the web... Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ----- Original Message ---- > From: Grant Ingersoll <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Saturday, April 26, 2008 4:20:19 PM > Subject: Re: Does lucene support distributed indexing? > > > On Apr 26, 2008, at 2:33 AM, Samuel Guo wrote: > > > Hi all, > > > > I am a lucene newbie:) > > > > It seems that lucene doesn't support distributed indexing:( > > As some IR research papers mentioned, when the documents collection > > become > > large, the index will be large also. When one single machine can't > > hold all > > the index, some strategies are used to solve it. such as that we can > > part > > the whole collection into several small sub-collections. According to > > different partitions, we can got different strategies : document- > > partittion > > and term-partition. but I don't know why not lucene support these > > ways:( > > can't anyone explain it ? > > Because no one has donated the code to do it. You can do distributed > indexing via Nutch and some (albeit non fault tolerant) distributed > Search in Lucene. Solr also now has distributed search. > > -Grant > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]