RE: lucene 4.3 seems to be much slower in indexing than lucene 3.6?

2013-07-26 Thread Zhang, Lisheng
Hi, Thanks very much for the confirmation! Yes, I will try to test more to find out. At face value, what I tried to index is pretty simple and I just used the default merge policy (the final merge time is actually not very different between 3.6 and 4.3). But surely I could be missing something simple, will try to find

RE: Detect a corrupted index

2013-07-26 Thread Zhang, Lisheng
Hi, I used the following code to detect data corruption in lucene 4.3.0: import org.apache.lucene.index.CheckIndex; ... CheckIndex checkIndex = new CheckIndex(getLuceneDirectory(folderPath)); CheckIndex.Status status = checkIndex.checkIndex();

Detect a corrupted index

2013-07-26 Thread ABlaise
Hi everyone ! So I am working on a Lucene index that will run on a server and since this server might crash/be killed at any time, even during the creation of an index, I would like to be able to detect if an index is corrupted or not. I don't care about repairing it, rebuilding it from scratch do

Re: ERROR: could not read any segments file in directory

2013-07-26 Thread Prakash Chinnakannan
Thanks for your time Mike, Yes the commit has been made successfully before crashing. Here is the output of ls -lrt on searchIndex/ directory >> searchIndex#] ls -lrt total 50624356 -rw-r--r-- 1 root root 32991 Jul 26 06:42 _18vyk7.fnm -rw-r--r-- 1 root root 39608652 Jul 26 06:42 _18vyk7.fdx -r

Re: AnalyzingInfixSuggester

2013-07-26 Thread vonPuh fonPuhendorf
That is because they are not in the suggest package but in the src test folder. 2013/7/26 vonPuh fonPuhendorf > using lucene 4.4 > > > 2013/7/26 vonPuh fonPuhendorf > >> Hello i am trying to build the example but TermFreqPayload >> and TermFreqPayloadArrayIterator are missing from suggest package also how

Re: lucene 4.3 seems to be much slower in indexing than lucene 3.6?

2013-07-26 Thread Nicolas Guyot
Hi Lisheng, first of all, on all my test cases, I can assure you lucene 4.3 is way more efficient than 3.6. Well, after understanding and tweaking a few things ;) Second, can you help us understand what is indexed and how? Like what kind of fields? Which merge policy?... Thanks, Nicolas On Fr

Re: AnalyzingInfixSuggester

2013-07-26 Thread vonPuh fonPuhendorf
using lucene 4.4 2013/7/26 vonPuh fonPuhendorf > Hello i am trying to build the example but TermFreqPayload > and TermFreqPayloadArrayIterator are missing from suggest package also how > to pass to suggester.build method real index instead of mock words so it > can rebuild it. > >

AnalyzingInfixSuggester

2013-07-26 Thread vonPuh fonPuhendorf
Hello, I am trying to build the example, but TermFreqPayload and TermFreqPayloadArrayIterator are missing from the suggest package. Also, how do I pass a real index to the suggester.build method instead of mock words, so that it can rebuild it?

lucene 4.3 seems to be much slower in indexing than lucene 3.6?

2013-07-26 Thread Zhang, Lisheng
Hi, I did some basic performance testing, just using random numbers to generate text for indexing; I attached the Java source code below. The commands I used are: java TestReal43 index -docCount 500 -start 1 -optimize true -luceneDir mmap java TestReal36 index -docCount 500 -start 1 -optimize true

Re: ERROR: could not read any segments file in directory

2013-07-26 Thread Michael McCandless
Likely there's nothing easy you can do to recover the index. If the crash was merely an "outage", and the IO system did not flip bits on files that were committed, then the index should have been intact. Can you post the ls -l of the index directory? Had you successfully committed to this index

ERROR: could not read any segments file in directory

2013-07-26 Thread Prakash Chinnakannan
Hi, Today we had a SAN outage and it looks like the lucene index directory got corrupted. We tried to fix it by using CheckIndex and below is the exception trace. Do we have any other possible ways to recover the index contents? ~#] java -cp lucene-3.2.0.jar org.apache.lucene.index.CheckIndex searchIn

Re: Search a Part of the Sentence/Complete sentence in lucene 4.3

2013-07-26 Thread Michael McCandless
Have a look at the position argument to PhraseQuery.add: it lets you control where this new term is in the phrase. So to search for "wizard of oz" when of is a stopword you would add "wizard" at position 0 and "oz" at position 2. This is different from slop, which allows for "fuzzy" matching of t
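
To make the position idea above concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class PhrasePositionSketch, its stopword list and its whitespace tokenizer are all hypothetical names made up for illustration. It assumes the analyzer drops stopwords but still counts them when assigning positions, so "oz" ends up two positions after "wizard". In Lucene 4.x the corresponding call would be PhraseQuery.add(Term, int), adding "wizard" at position 0 and "oz" at position 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of position-aware phrase matching; this is NOT the Lucene API.
public class PhrasePositionSketch {

    // Hypothetical stopword list, for illustration only.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("of", "the", "a"));

    // A term together with the position it occupies in the text.
    static final class PositionedTerm {
        final String term;
        final int position;

        PositionedTerm(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    // Lowercase, split on whitespace, drop stopwords, but keep the position
    // gap each removed stopword leaves behind.
    static List<PositionedTerm> analyze(String text) {
        List<PositionedTerm> out = new ArrayList<>();
        int position = 0;
        for (String raw : text.toLowerCase().trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // only happens for empty input
            }
            if (!STOPWORDS.contains(raw)) {
                out.add(new PositionedTerm(raw, position));
            }
            position++;
        }
        return out;
    }

    // True if the document holds this term at exactly this position.
    static boolean contains(List<PositionedTerm> doc, String term, int position) {
        for (PositionedTerm t : doc) {
            if (t.position == position && t.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    // Exact phrase match (no slop): some offset must line up every phrase
    // term with a document term at (offset + phrase position).
    static boolean matches(List<PositionedTerm> doc, List<PositionedTerm> phrase) {
        if (phrase.isEmpty()) {
            return false;
        }
        PositionedTerm first = phrase.get(0);
        for (PositionedTerm candidate : doc) {
            if (!candidate.term.equals(first.term)) {
                continue;
            }
            int offset = candidate.position - first.position;
            boolean all = true;
            for (PositionedTerm p : phrase) {
                if (!contains(doc, p.term, p.position + offset)) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<PositionedTerm> doc = analyze("the wizard of oz");
        List<PositionedTerm> withGap = Arrays.asList(
                new PositionedTerm("wizard", 0), new PositionedTerm("oz", 2));
        List<PositionedTerm> adjacent = Arrays.asList(
                new PositionedTerm("wizard", 0), new PositionedTerm("oz", 1));
        System.out.println("gap at position 1 matches: " + matches(doc, withGap));
        System.out.println("adjacent terms match: " + matches(doc, adjacent));
    }
}
```

In this model the phrase that leaves position 1 empty matches the text, while the phrase with "wizard" and "oz" at adjacent positions does not. Whether a real index records such gaps depends on how the analyzer and its stop filter are configured, so treat the gap as an assumption to verify against your own setup.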

Re: Search a Part of the Sentence/Complete sentence in lucene 4.3

2013-07-26 Thread Ankit Murarka
Hello, can you elaborate more on this? I seem to be lost here. Since I am new to lucene, yesterday I was going through ShingleFilter and its application. It seems to be a kind of N-gram thing, and it bloats the index, as Mike has mentioned. As of now I am only concerned with the app