Hello Otis,

> Hello Ard,
>
> What you are after is a higher mergeFactor and probably also
> a higher maxBufferedDocs. Is indexing performance the concern?

No, that is not what I am after, and mergeFactor does not really solve my issue. My issue is very similar to the thread "maxDocs and Arrays" (which I only read later), http://www.gossamer-threads.com/lists/lucene/java-user/49285.

I also want to keep some derived data from Lucene in in-memory arrays, to enable faceted authorized navigation in a Jackrabbit (JCR) repository. I have tested with millions of "derived data documents" in an array and can compute faceted authorized navigation very efficiently. But of course, as the Lucene index changes, I need to update my derived data.

When adding a document to Lucene, I can normally just append an item to my derived data array, unless:

1) Lucene did a merge, and
2) after the merge, writer.docCount() != writerDoccountBeforeUpdate + 1 (this means the merge involved a segment containing at least one deleted doc, which reduced docCount).

If 1 and 2 are both true, I need to recreate my derived data array, because the array positions no longer coincide with those in Lucene.

Therefore I want to minimize merges (recreating the array is expensive). As you say, this can be done by setting a large mergeFactor (and, for example, enabling the compound file format to bring the number of files back down) and a large maxBufferedDocs. But increasing the default number of documents in the "smallest" segments from 10 to, say, 100 would also help me.

Then again, I am not sure whether I am doing something that could be achieved more effectively or simply.

Thanks in advance for any pointers,

Regards,
Ard Schrijvers

> Don't go crazy with setting a super high (e.g. 100+)
> mergeFactor, unless you really have the number of open files
> on your server(s) set to a solid/high number. maxBufferedDocs
> can be set to a much higher number, typically, depending on
> the size of the documents you are trying to index and the
> amount of heap the JVM has to work with.
> There is also a new
> API for explicit flushes of in-memory documents while
> indexing to control memory consumption.
>
> Otis
> --
> Lucene Consulting -- http://lucene-consulting.com/
>
> ----- Original Message ----
> From: Ard Schrijvers <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Friday, May 25, 2007 8:40:26 AM
> Subject: RE: Setting the maximum number of documents in a lucene segment
>
> > Hello,
> >
> > I am trying to change the maximum number of documents in a
> > lucene segment. By default it seems to be 10.
>
> Correction: 10 for the smallest (just created) segments of
> course, because obviously merged segments are likely to
> contain many more documents
>
> > When I have a
> > mergeFactor of say 10, then on average, after every 100 added
> > documents lucene is merging segments.
> >
> > I want each segment to contain more than the default 10
> > documents, because I need to minimize merging.
> >
> > Is there a way to achieve this?
> > writer.setMaxBufferedDocs(largeValue) does not do the trick
> > (I think because in my case the writer is flushed and
> > closed after a few updates)
> >
> > Does anyone know whether it is possible to make the default
> > number of documents a segment can contain larger?
> >
> > Thanks in advance,
> >
> > Ard Schrijvers
> >
> > --
> >
> > Hippo
> > Oosteinde 11
> > 1017WT Amsterdam
> > The Netherlands
> > Tel +31 (0)20 5224466
> > -------------------------------------------------------------
> > [EMAIL PROTECTED] / http://www.hippo.nl
> > --------------------------------------------------------------
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
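
For readers following the thread, the append-or-rebuild rule described in the reply above could be sketched roughly as follows. This is only an illustration under assumptions, not code from Lucene or Jackrabbit: the class DerivedDataArray, its method names, and the rebuild callback are all hypothetical, and the two count arguments simply stand in for values read from writer.docCount() before and after adding a document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical holder for per-document derived data, one slot per document. */
class DerivedDataArray {
    private final List<String> slots = new ArrayList<>();
    private int rebuilds = 0;

    /**
     * Call after one document has been added to the index.
     * countBefore/countAfter stand in for writer.docCount() read before and
     * after the add. rebuild is a hypothetical callback that re-reads all
     * derived values in index order.
     */
    void onDocumentAdded(int countBefore, int countAfter, String value,
                         Supplier<List<String>> rebuild) {
        if (countAfter == countBefore + 1) {
            // Rule from the thread: the count grew by exactly one, so append.
            slots.add(value);
        } else {
            // A merge dropped deleted docs, so positions no longer line up.
            slots.clear();
            slots.addAll(rebuild.get());
            rebuilds++;
        }
    }

    int size() { return slots.size(); }
    int rebuilds() { return rebuilds; }
    String get(int position) { return slots.get(position); }

    public static void main(String[] args) {
        DerivedDataArray data = new DerivedDataArray();
        data.onDocumentAdded(0, 1, "a", () -> Arrays.asList("a"));
        data.onDocumentAdded(1, 2, "b", () -> Arrays.asList("a", "b"));
        // Simulated merge that expunged one deleted doc: count is 2, not 3.
        data.onDocumentAdded(2, 2, "c", () -> Arrays.asList("b", "c"));
        System.out.println(data.size() + " slots, " + data.rebuilds() + " rebuild(s)");
    }
}
```

The sketch only covers the decision itself. How the rebuild callback obtains the values (for example by walking the index with a reader) is left open, since the thread does not say.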