Re: asking about index verification tools

2010-11-17 Thread Erick Erickson
How could there be such a tool? Consider the number of ways that a given input stream can be defined. WordDelimiter, Stopwords, synonyms, etc. Eventually, you'd reconstruct all of the logic embedded in the analysis process in your checking program. Then you'd wonder if that was correct. There's qu

Re: asking about index verification tools

2010-11-17 Thread Yakob
Yes, you're correct, but I was just wondering about my chances here. Are there any tools that do this cross-checking of the index? Or, when you build a search engine, do you simply trust it and feel that cross-checking the index isn't really necessary? What do you do in this situation? :-)

Re: asking about index verification tools

2010-11-17 Thread Anshum
Lance, CheckIndex only checks the sanity of the index, not whether all words from the source were actually added to it. It only detects corrupt indexes, and it takes a lot of time in the process. Perhaps what Yakob wanted here is just a cross check between the in

Re: asking about index verification tools

2010-11-17 Thread Lance Norskog
The Lucene CheckIndex program does this. It is a class somewhere in Lucene with a main() method. Samarendra Pratap wrote: It is not guaranteed that every term will be indexed. There is a limit on the maximum number of terms per field (as of Lucene 3.0, and maybe earlier too). Check out this http://

Re: asking about index verification tools

2010-11-16 Thread Samarendra Pratap
It is not guaranteed that every term will be indexed. There is a limit on the maximum number of terms per field (as of Lucene 3.0, and maybe earlier too). Check out this http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/index/IndexWriter.html#setMaxFieldLength(int) On Tue, Nov 16, 2010 at
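
To make the per-field limit concrete, here is a minimal sketch in plain Java with no Lucene dependency. It only simulates a term cap. The class name `FieldLengthSketch`, the method `droppedTerms`, and the whitespace tokenization are assumptions made for illustration; a real analyzer tokenizes differently. In Lucene 3.0 the default cap is 10,000 terms per field (`IndexWriter.MaxFieldLength.LIMITED`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldLengthSketch {

    /**
     * Returns the tokens that fall beyond the first maxFieldLength tokens,
     * i.e. the ones a per-field term cap would silently skip.
     * Whitespace splitting stands in for a real analyzer (assumption).
     */
    public static List<String> droppedTerms(String fieldText, int maxFieldLength) {
        String trimmed = fieldText.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> tokens = Arrays.asList(trimmed.split("\\s+"));
        if (tokens.size() <= maxFieldLength) {
            return new ArrayList<>();
        }
        // Everything after the cap never reaches the index.
        return new ArrayList<>(tokens.subList(maxFieldLength, tokens.size()));
    }

    public static void main(String[] args) {
        // With a cap of 2, the third and fourth tokens are dropped.
        System.out.println(droppedTerms("alpha beta gamma delta", 2));
    }
}
```

A document longer than the cap would therefore look incomplete in any cross-check even though indexing ran without errors.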

Re: asking about index verification tools

2010-11-15 Thread Anshum
Hi, One way to do that would be to iterate over the terms and then reconstruct the document, or just check for the terms one after the other. Though Luke also reconstructs the document and you could use the reconstruction logic to do the same and compare, it is not guaranteed that the reconstruction wou
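
The "check for the terms one after the other" idea can be sketched as a set difference. This is a hedged illustration in plain Java, not Lucene code: the class `TermCrossCheck` and the method `missingTerms` are hypothetical names. The `indexed` set is assumed to have been collected beforehand, for example by walking the `TermEnum` returned by `IndexReader.terms()` in Lucene 3.x. The `expected` set is assumed to come from running the source text through the same analyzer that was used at index time; otherwise stop words, lowercasing, and similar analysis steps will show up as false mismatches.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class TermCrossCheck {

    /**
     * Returns the terms expected from the source that are absent
     * from the index, in sorted order for stable reporting.
     */
    public static Set<String> missingTerms(Set<String> expected, Set<String> indexed) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(indexed);
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative data only; real sets would come from the source and the index.
        Set<String> expected = new HashSet<>(Arrays.asList("lucene", "index", "luke"));
        Set<String> indexed = new HashSet<>(Arrays.asList("lucene", "index"));
        System.out.println(missingTerms(expected, indexed));
    }
}
```

An empty result only shows that every expected term occurs somewhere in the index; it says nothing about positions, frequencies, or which document a term belongs to.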

asking about index verification tools

2010-11-15 Thread Yakob
Hello all, I would like to ask about Lucene indexes. I created a simple program that creates Lucene indexes and stores them in a folder. I have also used a diagnostic tool named Luke to look inside the Lucene index and inspect its contents. And I know that Lucene is a standard framework whe