I think your bottleneck is most likely the DB hit. I assume by 20000 products you mean 20000 distinct entries into the Lucene Index, i.e. 20000 rows in the DB to select from.
I index about 1.5 million rows from a SQL Server 2000 database with several fields for each entry and it finishes in about twenty minutes so you should be able to index 20000 rows in a few seconds. Make sure your database table(s) are indexed appropriately according to your select statements. Indexing correctly will be the biggest performance improvement you will see. Best of luck. KLCobb -----Original Message----- From: Mufaddal Khumri [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 19, 2005 2:11 PM To: java-user@lucene.apache.org Subject: Lucene bulk indexing Hi, I am sure this question must be raised before and maybe it has been even answered. I would be grateful, if someone could point me in the right direction or give their thoughts on this topic. The problem: I have approximately over 20000 products that I need to index. At the moment I get X number of products at a time and index them. This process takes about 26 minutes (Am indexing the database id, product name, product description). I was thinking of ways to make this indexing faster. For this I was thinking about writing a threaded module that would index X number of products simultaneously. For instance I could spawn (Number of products/X) number of threads and do the indexing. I am guessing this would be faster but by what factor would this be faster? (I understand the writes to the index are synchronized by lucene). Is there any other approach by which I could speed up the indexing? Thoughts? Suggestions? Thanks, Mufaddal. ------------------------------------------------------------------------ ------------------ This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. Please note that any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Finally, the recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. Consult your physician prior to the use of any medical supplies or product. ------------------------------------------------------------------------ ------------------ --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]