Hi Paul,
W.r.t. ConstantScoringQuery, it contains a minor bug: it doesn't the
handle the case where the Filter.bits method would return null. I think
the ConstantScorer should look like:
public boolean next() throws IOException {
doc = (bits == null) ? doc+1 : bits.nextSetBit(doc+1);
return doc >= 0;
}
public boolean skipTo(int target) throws IOException {
doc = (bits == null) ? target : bits.nextSetBit(target); //
requires JDK 1.4
return doc >= 0;
}
Anyways, while the ConstantScoringQuery does offer a great new feature,
it does not do what I want: when my custom filter bits method is called,
I still have to create a BitSet object with size reader.maxDoc(), which
value is the total number of documents in the index, and then determine
for each document if it should be included or excluded. This is
performance killing for me.
I have the same problem with HitCollector: I'd need to go through /all/
documents.
After playing around a bit, I think I could solve this by adding
TermQuery's as the last terms to a BooleanQuery, but that would mean I'd
also have to store certain (security related) values in separate fields
in the index. I could live with that, if I'm seeing this right: when I
set the boost factor of those fields to 0, then it won't affect the
scoring, right?
Regards,
Cret
Paul Elschot wrote:
On Saturday 17 December 2005 17:04, Cret Hummin wrote:
Hi All,
When using Searcher.search(Query, Filter), and I use my own custom
filter, it appears I'm presented with /all/ the documents in the index,
i.e. in the method bits(IndexReader reader) from my custom Filter, the
value of reader.maxDoc() is always the number of documents in the index.
The same is true when do Searcher.search(FilteredQuery(Query, Filter)).
Is it possible to filter /after/ the query has limited the number of
possible documents, /before/ returning a Hits collection?
The easiest way to do this is by adding a required clause to a BooleanQuery.
You might consider using a ConstantScoringQuery for this clause:
http://issues.apache.org/jira/browse/LUCENE-383
In case you really want to filter only the documents that match a query
you'll need to implement a filtering HitCollector and use it on the
lower level search API.
An easier way to implement such a filtering HitCollector could be
by adding to it the search methods that return a Hits as an alternative
to Filter.
A disadvantage of this approach is that skipTo() cannot be used
to combine the filter and the query, see also here:
http://issues.apache.org/jira/browse/LUCENE-330
Regards,
Paul Elschot
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]