Re: NRT consistency

2011-04-11 Thread Mark Miller
On Apr 11, 2011, at 2:41 PM, Otis Gospodnetic wrote:
> I think what's being described here is a lot like what I *think* ElasticSearch does, where there is no single master and index changes made to any node get propagated to N-1 other nodes (N=number of index replicas). I'm not sure how

Re: NRT consistency

2011-04-11 Thread Mark Miller
On Apr 11, 2011, at 1:05 PM, Em wrote:
> Thank you both!
>
> Mark, could you explain what you mean? I have never heard of such an index-splitter. BTW: The idea of having a segment per document sounds a lot like an exception for too many FileDescriptors :)
This is just an idea for rebalancing I

Re: NRT consistency

2011-04-11 Thread 张成
Something like Dynamo's pattern: for near-real-time search, we should make N = W.
On 2011-04-11 at 23:52, "Mark Miller" wrote:
> On Apr 10, 2011, at 4:34 AM, Em wrote:
>> Hello list,
>>
>> I am currently trying to understand Lucene's Near-Real-Time-Feature which was covered in "Lucene in Actio

Can't perform exact match...?

2011-04-11 Thread Chris Mantle
Hi, I’m having some trouble with Lucene at the moment. I have a number of unique identifiers that I need to search through. They’re in many different forms, e.g. “M”, “MO”, “:MOFB”, “FH..L-O”, etc. All I need to do is an exact prefix search: at the moment, if I type in ‘M’, I get “M”, “MO” and “:

Re: NRT consistency

2011-04-11 Thread Otis Gospodnetic
I think what's being described here is a lot like what I *think* ElasticSearch does, where there is no single master and index changes made to any node get propagated to N-1 other nodes (N=number of index replicas). I'm not sure how it deals with situations where "incompatible" index changes a

Re: NRT consistency

2011-04-11 Thread Michael McCandless
On Mon, Apr 11, 2011 at 1:05 PM, Em wrote:
> Mike, as you said, the segments are flushed like normal.
> Let's say my server dies for whatever reason, when restarting it and reopening the index-writer: Does the IW delete the flushed file, because it is not mentioned in the segmentInfo - file

Re: NRT consistency

2011-04-11 Thread Em
Thank you both! Mark, could you explain what you mean? I have never heard of such an index-splitter. BTW: The idea of having a segment per document sounds a lot like an exception for too many FileDescriptors :) Mike, as you said, the segments are flushed like normal. Let's say my server dies for wha

Re: NRT consistency

2011-04-11 Thread Mark Miller
On Apr 10, 2011, at 4:34 AM, Em wrote:
> Hello list,
>
> I am currently trying to understand Lucene's Near-Real-Time-Feature which was covered in "Lucene in Action, Second Edition".
>
> Let's say I got a distributed system with a master and a slave.
>
> In Solr replication is solved by check

Re: Difference between regular Highlighter and Fast Vector Highlighter ?

2011-04-11 Thread Mark Miller
The general and short answer is: Highlighter: highlights more query types, has a fairly rich API, doesn't scale well to very large documents (though https://issues.apache.org/jira/browse/LUCENE-2939 is going to help a lot here) - does not require that you store term vectors, but is faster if yo