Nick, it's a guess, but the only difference between my approach and yours is
that I optimize as soon as I open the writer, while you optimize only after
the last (100th) document is written.
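Roughly, my loop looks like this (a sketch from memory, not my actual code --
"indexPath" and "docs" are placeholders for your own index location and
document list):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;

    // false = append to the existing index rather than create a new one
    IndexWriter writer =
        new IndexWriter(indexPath, new StandardAnalyzer(), false);
    for (int i = 0; i < docs.size(); i++) {
        // optimize as soon as the writer is open, and again before each
        // batch of 100 documents -- not once after the last document
        if (i % 100 == 0) {
            writer.optimize();
        }
        writer.addDocument((Document) docs.get(i));
    }
    writer.close();

Each optimize() merges everything down to a single segment, so the number of
segment files (and open file handles) never gets a chance to pile up between
merges.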
At the same time I am using:

    writer.setUseCompoundFile(true);
    writer.mergeFactor = 10;

(I really wonder why mergeFactor is public.) I am also running it on Linux,
and this was the only way I solved the problem (I didn't even touch the
ulimit). Try this exact combination (compound file, optimizing every 100
documents before writing them, mergeFactor = 10) and please report back.

paulo

On 3/16/06, Hannes Carl Meyer <[EMAIL PROTECTED]> wrote:
> Hi Nick,
>
> use 'ulimit' on your *nix system to check if it's set to unlimited:
>
> http://wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/2/ulimit
>
> You don't have to set it to unlimited; maybe increasing the number will
> help.
>
> later
>
> Hannes
>
> Nick Atkins wrote:
> > Thanks Otis, I tried that but I still get the same problem at the
> > ulimit -n point. I assume you meant I should call
> > IndexWriter.setUseCompoundFile(true). According to the docs, the
> > compound structure is the default anyway.
> >
> > Any further thoughts? Anything I can tweak in the OS (Linux), Java
> > (1.5.0) or Lucene (1.9.1)?
> >
> > Many thanks,
> >
> > Nick
> >
> > Otis Gospodnetic wrote:
> >
> >> The easiest first step to try is to go from the multi-file index
> >> structure to the compound one.
> >>
> >> Otis
> >>
> >> ----- Original Message ----
> >> From: Nick Atkins <[EMAIL PROTECTED]>
> >> To: java-user@lucene.apache.org
> >> Sent: Thursday, March 16, 2006 3:00:59 PM
> >> Subject: Lucene and Tomcat, too many open files
> >>
> >> Hi,
> >>
> >> What's the best way to manage the number of open files used by Lucene
> >> when it's running under Tomcat? I have an indexing application running
> >> as a web app, and I index a huge number of mail messages (upwards of
> >> 40000 in some cases). Lucene's merging routine always craps out
> >> eventually with a "too many open files" error, regardless of how large
> >> I set ulimit. lsof tells me the files are all "deleted", but they
> >> still seem to count as open files. I don't want to set ulimit to some
> >> enormous value just to work around this (because it will never be
> >> large enough). What's the best strategy here?
> >>
> >> I have tried setting various parameters on the IndexWriter, such as
> >> MergeFactor, MaxMergeDocs and MaxBufferedDocs, but they seem to affect
> >> only the merge timing algorithm with respect to memory usage. The
> >> number of files used seems to be unaffected by anything I can set on
> >> the IndexWriter.
> >>
> >> Any hints much appreciated.
> >>
> >> Cheers,
> >>
> >> Nick.

--
Paulo E. A. Silveira
Caelum Ensino e Soluções em Java
http://www.caelum.com.br/
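For reference, the ulimit check Hannes describes looks like this in a bash
shell (8192 is just an example value, not a recommendation from the thread):

    ulimit -n         # show the current per-process open-file limit
    ulimit -n 8192    # raise it for this shell and its child processes

The new limit applies only to processes started from that shell, so Tomcat
would have to be restarted from the same shell to pick it up.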