Re: Norm Value of not existing Field

2009-12-04 Thread Erick Erickson
The word "Filter" as part of a class name is overloaded in Lucene. See: http://lucene.apache.org/java/2_9_1/api/all/index.html The above filter is just a DocIdSet, one bit per document. So in your example, you're only talking 12M or so, even if you create one filter for every field and keep it aro

Re: Norm Value of not existing Field

2009-12-04 Thread Benjamin Heilbrunn
Erick, I'm not sure I understand you correctly. What do you mean by "spinning through all the terms on a field"? One option would be to load all unique terms of a field using TermEnum, then use TermDocs to get the docs for those terms. The rest of the docs don't contain a term, and so you know, th
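
The enumerate-terms-then-collect-docs idea described above can be sketched without Lucene at all. The following is only an illustration under stated assumptions: the class FieldPresenceSketch, its method names, and the Map<String, int[]> standing in for a field's postings (what TermEnum and TermDocs would deliver for one field) are hypothetical, and java.util.BitSet stands in for whatever DocIdSet a real Filter would return.

```java
import java.util.BitSet;
import java.util.Map;

// Hypothetical stand-in for "walk every term of a field and set one bit per matching doc".
// The postings map models what TermEnum/TermDocs would yield for a single field.
final class FieldPresenceSketch {

    // Set a bit for every document that contains at least one term in the field.
    static BitSet docsWithField(Map<String, int[]> postings, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] docIds : postings.values()) {
            for (int docId : docIds) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Every document in [0, maxDoc) whose bit is still clear has no term in the field.
    static BitSet docsWithoutField(BitSet withField, int maxDoc) {
        BitSet missing = (BitSet) withField.clone();
        missing.flip(0, maxDoc);
        return missing;
    }
}
```

With postings {"apple" -> [0, 2], "pear" -> [2, 4]} and maxDoc = 6, the first method marks documents 0, 2 and 4, and the second marks 1, 3 and 5. The cost is one bit per document per field, which matches the "one bit per document" remark elsewhere in the thread.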

Re: Norm Value of not existing Field

2009-12-03 Thread Erick Erickson
It would be clumsier, but you could create a Filter by spinning through all the terms on a field and setting the appropriate bit. You could even do this at startup and keep the filters around for all the fields you care about, or cache them when first used. The advantage I see here is that it wo

Re: Norm Value of not existing Field

2009-12-03 Thread Michael McCandless
This isn't easy to change; it's hardcoded to 1.0 in oal.index.NormsWriter, and also in SegmentReader (when the field doesn't have norms stored, but e.g. someone is requesting them anyway). 1.0 must encode to 124. I suppose we could empower Similarity to define what the "undefined norm val
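
To see why 1.0 maps to the byte 124, here is a small standalone sketch. It assumes the norm byte uses the 3-mantissa-bit, zero-exponent-15 small-float scheme (the one I believe Lucene's SmallFloat implements); the class NormByteSketch is hypothetical and is a re-implementation for illustration, not a call into Lucene.

```java
// Hypothetical standalone re-implementation of a 3-mantissa-bit, zero-exponent-15
// small-float encoding, assumed here to be the scheme behind the norm byte.
final class NormByteSketch {

    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        // Keep the sign, the exponent and the top 3 mantissa bits.
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Zero and negatives map to 0; positive underflow maps to the smallest value.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: largest representable value
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }
}
```

Working it through by hand: the raw bits of 1.0f are 0x3F800000, shifting right by 21 gives 508, and 508 - 384 = 124. Under this assumption, a document that lacks the field and a document whose norm really is 1.0 would both read back as 124 from the norms array.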

Norm Value of not existing Field

2009-12-03 Thread Benjamin Heilbrunn
Hi, I'm using Lucene 2.9.1 patched with http://issues.apache.org/jira/browse/LUCENE-1260 . For some special reason I need to find all documents which contain at least one term in a certain field. Iterating the norms array works only as long as the field exists on every document. For documents