Mark Miller wrote:
Lebiram wrote:
Also, what are norms?
Norms are a byte value per field stored in the index that is factored
into the score. They are used for length normalization (shorter documents
= more important) and for index-time boosting. If you want either of
those, you need norms. When norms are loaded up into an IndexReader,
they are loaded into a byte[maxDoc] array for each field - so even if
only one document out of 400 million has a field, it will still load
byte[maxDoc] for that field (a lot of wasted RAM). Did you say you had
400 million docs and 7 fields? Google says that would be:

400 million docs x 7 fields x 1 byte ~= 2,670.29 MB (roughly 2.7 GB)

On top of your other RAM usage.
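
For what it's worth, here is a minimal sketch of how you would turn
norms off per field, assuming the Lucene 2.x/3.x-era Field API (the
field names and values are made up for illustration):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    public class OmitNormsExample {
        public static Document makeDoc(String id, String body) {
            Document doc = new Document();
            // An identifier field rarely needs length normalization or
            // index-time boosts, so norms can be omitted to save RAM.
            doc.add(new Field("id", id,
                    Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
            // ANALYZED_NO_NORMS tokenizes the text but writes no norms,
            // so no byte[maxDoc] array gets allocated for this field
            // at search time.
            doc.add(new Field("body", body,
                    Field.Store.NO, Field.Index.ANALYZED_NO_NORMS));
            return doc;
        }
    }

If I remember right, norms are also "sticky": once even one document in
the index has norms for a field, merged segments end up carrying norms
for every document in that field, which is exactly the byte[maxDoc]
waste described above.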
Just to avoid confusion, that should really read a byte per document per
field. If I remember right, the single-byte encoding gives 255 possible
boost values, limited to about 25 once length normalization is folded in.
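
To see that precision loss concretely, here is a small sketch using the
static encodeNorm/decodeNorm helpers that Similarity exposed in the
Lucene 2.x/3.x era (later versions changed this API, so treat the method
names as an assumption about that era):

    import org.apache.lucene.search.Similarity;

    public class NormPrecision {
        public static void main(String[] args) {
            // The norm is a float squeezed into a single byte, so
            // nearby boost values collapse onto the same encoded byte
            // and decode back to the same coarse float.
            float[] boosts = {1.0f, 1.05f, 1.1f, 1.25f, 2.0f};
            for (float b : boosts) {
                byte encoded = Similarity.encodeNorm(b);
                float decoded = Similarity.decodeNorm(encoded);
                System.out.printf("boost %.2f -> byte %d -> %.4f%n",
                        b, encoded, decoded);
            }
        }
    }

Running something like this makes it easy to see which of your intended
boost values actually survive the round trip through the one-byte norm.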