Hi, I am building a search engine for text transcript documents from the database of an enterprise messaging system, and have designed a batch processing job to incrementally build the index,because the database from production is around huge, around 10G. Now I am still testing in DEV environment, and have been puzzled by this problem for a couple of days. If I build the index in one setting(because DEV database is very very small), the index is correct because I can get hits for my queries, also, what luke shows looks fine, 4800 documents, 450 terms. However, if I test building using my batch processing job, I do get the index which looks fine, but, when I search, it already returns 0 hits. I checked with Luke, which shows there are 5200 documents, 0 terms . There is no exception or runtime error or anything abnormal during indexing or searching, I am really at a loss. The only difference between the two is that: in the one setting approach, the whole index is built using the same indexwriter object. in the batch approach, an indexwriter object is opened per batch and closed when the batch is finished. But, I think I have taken care of it by IndexWriter writer = new IndexWriter(FSDir, Analyser, !FSdir.exists) Since lucene is designed for adding to exisiting index when the 3rd parameter is false, I do not understand where it went wrong. Should I have kept one singleton instance of the writer until all documents in the database are processed, rather than opening &closing one for each batch? Or, should I have kept a single instance of analyser? This does not seem necessary, but I really can not figure out where it went wrong, and how come this strange behavior: 520 documents but 0 terms.
I would be very grateful if anyone could advise. THanks very much. yanyan -- View this message in context: http://www.nabble.com/bad-index-by-batch-indexing-tp18862037p18862037.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]