I'll bet the byte[] arrays are the norm data, one array per field. If you have a lot of fields and don't need norms for scoring on all of them, I'd suggest turning norms off for the fields that don't need them. The calculation, as I understand it, is:

1 byte x (# fields with normalization turned on) x (# documents within the index)

adds up pretty quickly!
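
To put rough numbers on that, here's a quick sketch in plain Java (no Lucene calls). The document count is the ~1,000,000 from your message; the field count and the number of open readers are made-up figures for illustration, and I'm assuming each open reader ends up holding its own copy of the norms, so plug in your real values:

```java
class NormsMemoryEstimate {

    // One byte per document for every field that has norms turned on.
    static long normsBytes(int fieldsWithNorms, int documents) {
        return (long) fieldsWithNorms * documents;
    }

    // Assumption: every open reader loads its own copy of the norms.
    static long totalBytes(int fieldsWithNorms, int documents, int openReaders) {
        return normsBytes(fieldsWithNorms, documents) * openReaders;
    }

    public static void main(String[] args) {
        int documents = 1000000;   // roughly the collection size quoted below
        int fieldsWithNorms = 20;  // hypothetical
        int openReaders = 50;      // hypothetical: one per concurrent search

        // 20 x 1,000,000 = 20,000,000 bytes for a single reader
        System.out.println("per reader: "
                + normsBytes(fieldsWithNorms, documents) + " bytes");

        // 20,000,000 x 50 = 1,000,000,000 bytes across all readers
        System.out.println("all readers: "
                + totalBytes(fieldsWithNorms, documents, openReaders) + " bytes");
    }
}
```

With those guesses a single reader is cheap, but fifty of them open at once would already be around half of a 2 gig box.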

The char[] and String instances will be your FieldCache entries, probably used for sorting. Do you do any sorting other than by relevance?

cheers,

Paul

On 18/03/2008, at 8:57 AM, <[EMAIL PROTECTED]> wrote:

I'm running Lucene 2.3.1 with Java 1.5.0_14 on 64-bit Linux. We have fairly large collections (~1 GB collection files, ~1,000,000 documents). When I try to load test our application with 50 users, all doing simple searches via a web interface, we quickly get an OutOfMemory exception. When I do a jmap dump of the heap, this is what I see:

Size         Count     Class description
-------------------------------------------------------
195818576    4263822   char[]
190889608    13259     byte[]
172316640    4307916   java.lang.String
164813120    4120328   org.apache.lucene.index.TermInfo
131823104    4119472   org.apache.lucene.index.Term
37729184     604       org.apache.lucene.index.TermInfo[]
37729184     604       org.apache.lucene.index.Term[]

So 4 of the top 7 memory consumers are Term related. We have 2 GB of RAM available on the system, but we get OOM errors no matter what the Java heap settings are. Has anyone seen this issue, and does anyone know how to solve it?

We do use separate MultiSearcher instances for each search. (We actually have 2 collections that we search via a MultiSearcher.) We tried using a singleton searcher instance, but our collections are constantly being updated and a singleton searcher only sees the index as it was when the searcher was opened. Creating new searcher objects at search time gives you up-to-the-minute search results.

I've seen some postings referring to an Index Divisor setting which could reduce the Terms in memory, but I have not seen how to set this value for Lucene.

Any help would be greatly appreciated.

Rich

Paul Smith
Core Engineering Manager

