Is there a way to not factor norms data into scoring somehow? I'm just stumped as to how Luke is able to do a search (with a limit) on the docs, but in my code it just dies with OutOfMemory errors. How does Luke not allocate these norms?
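What I'd like is some kind of per-field switch so the norms never get written (or loaded) at all. A minimal sketch of the kind of thing I mean, assuming the Lucene 2.4 Field/IndexWriter API and a placeholder index path - as far as I can tell, setOmitNorms(true) drops norms for that field (so no length normalization or index-time boost there), and I assume every document would have to be indexed that way before it actually saves any RAM:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class OmitNormsSketch {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.getDirectory("/path/to/index"); // placeholder path
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true,
                IndexWriter.MaxFieldLength.UNLIMITED);

        Document doc = new Document();
        Field body = new Field("body", "some text", Field.Store.NO, Field.Index.ANALYZED);
        // No norms for this field: no length normalization or index-time boost,
        // but also no byte[maxDoc] array when an IndexReader opens the index.
        body.setOmitNorms(true);
        doc.add(body);

        writer.addDocument(doc);
        writer.close();
    }
}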
________________________________
From: Mark Miller <markrmil...@gmail.com>
To: java-user@lucene.apache.org
Sent: Tuesday, December 23, 2008 5:25:30 PM
Subject: Re: Optimize and Out Of Memory Errors

Mark Miller wrote:
> Lebiram wrote:
>> Also, what are norms
> Norms are a byte value per field stored in the index that is factored into
> the score. It's used for length normalization (shorter documents = more
> important) and index-time boosting. If you want either of those, you need
> norms. When norms are loaded up into an IndexReader, they are loaded into a
> byte[maxDoc] array for each field - so even if only one document out of 400
> million has that field, it will still load byte[maxDoc] for that field (so a
> lot of wasted RAM). Did you say you had 400 million docs and 7 fields?
> Google says that would be:
>
> 400 million x 7 bytes = 2,670.28809 megabytes
>
> On top of your other RAM usage.

Just to avoid confusion, that should really read a byte per document per field. If I remember right, it gives 255 boost possibilities, limited to 25 with length normalization.
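Double-checking that estimate for my own sanity (the numbers are just the 400 million docs and 7 fields from the quoted mail, at one byte per document per field):

public class NormsMemoryEstimate {
    public static void main(String[] args) {
        long maxDoc = 400000000L;  // roughly the document count mentioned above
        int fieldsWithNorms = 7;   // fields that would carry norms
        // One byte per document per field once the norms are loaded.
        long bytes = maxDoc * fieldsWithNorms;
        System.out.printf("%.2f MB%n", bytes / (1024.0 * 1024.0));
        // Prints 2670.29 MB, matching the figure in the quote - on top of everything else on the heap.
    }
}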