I would recommend not optimizing your index that often.  Another solution is to 
use the multisearcher and keep one fully optimized primary index, and an 
unoptimized secondary index that you add to.  Then search against both.  During 
off peak hours you could merge the secondary index onto your primary index, 
then optimize .

-----Original Message-----
From: Goel, Nikhil [mailto:[EMAIL PROTECTED]
Sent: Monday, April 04, 2005 10:14 PM
To: java-user@lucene.apache.org
Subject: Time taken in Indexing when the index is already huge


Hi, 

   

I have been using lucene-1.3.jar for quite some time and we are using another 
library to store the index in DB. 

When we started indexing  the writer.optimize used to take in the range of 
600-800 milliseconds to return but now our index has grown to huge proportion 
and its around 10 MB hence the writer.optimize is taking around 30-40 seconds 
and it is not acceptable for our solution. I put the timings on 
writer.optimize() and it's the one which takes most of this time. 

 

So I am just wondering if someone is facing the same problem in indexing the 
data when the index is already huge or is there another way to manage such huge 
index.

 

Here is the simple code which we use to index the data. 

IndexWriter writer = new IndexWriter(dbDirectory, new StandardAnalyzer(), 
false); //Create an indexwriter

writer.addDocument(doc); //doc is of type  
org.apache.lucene.document.Document...

writer.optimize(); //optimize is called on indexwriter..This is the one which 
takes most of the time and is responsible for the delay.

writer.close(); // indexwriter is closed

 

 

The time taken by optimize call grows a lot when the index is of larger size. I 
tried to look it up on Erik Hatcher and Otis Gospodnetić 
<http://www.manning.com/hatcher2#author#author>  book too but everywhere it 
says Lucene is quite scalable and don't have trouble in indexing even with huge 
data. Can anyone please provide  some insight into this?

 

Thanks.

Nikhil

 

 


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to