Increase maxBufferedDocs as much as you can (the more RAM the better). Increase the max heap for the JVM (-Xmx). If you are on UNIX, raise the limit on open file descriptors and then increase mergeFactor somewhat, say to 100; a higher mergeFactor leaves more segment files open at once, which is why the descriptor limit comes first. Use -server mode for the JVM. Don't worry about RAMDir, those guys who wrote Lucene in Action were smoking something when they wrote that section. ;) Don't tokenize fields if you don't have to. Don't store term vectors if you don't need them.
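One quick way to confirm that the JVM flags mentioned above actually took effect is to print what the running JVM reports. This is only a minimal sketch using the standard library; the class name JvmSettingsCheck and its helper methods are made up for illustration and are not part of Lucene.

```java
// Hypothetical helper (not part of Lucene): reports the heap ceiling and VM
// name so you can see whether -Xmx and -server were picked up.
class JvmSettingsCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes (reflects -Xmx). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Name of the running VM, as reported by the java.vm.name system property. */
    public static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("vm name:       " + vmName());
    }
}
```

Run it with the same flags you pass to the indexer, for example `java -server -Xmx1024m JvmSettingsCheck` (the 1024m value is just a placeholder). On JVMs that distinguish client and server modes, the VM name typically says which one is in use.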
Otis

----- Original Message ----
From: Alice <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, January 11, 2007 12:16:51 PM
Subject: RE: Huge Index

I never got to index all the data, but it is too slow. I got 3 million documents indexed in 2.5 hours.

As suggested in Lucene in Action, I use a ramDir, and after I write 5000 documents I merge them into the fsDir.

The merge factor is now 100. I tried other variations, but they didn't make much difference.

-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 11, 2007 15:07
To: java-user@lucene.apache.org
Subject: Re: Huge Index

Hi Alice,

Can you define slow (hours, days, months and on what hardware)? Have you done any profiling, etc. to see where the bottlenecks are? What size documents are you talking about here? What are your merge factors, etc.?

Thanks,
Grant

On Jan 11, 2007, at 10:47 AM, Alice wrote:

> Hello!
>
> I have to index 37 million documents retrieved from the database.
>
> I was trying to do it by loading intervals of 10000 records, but it is
> too slow.
>
> Could anybody suggest a better way to get all the data indexed in a
> reasonable time?
>
> Thanks

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]