Hi,

I am trying to write a WildcardFilter in order to prevent BooleanQuery.TooManyClauses exceptions and high memory usage. I wrap a Filter in a ConstantScoreQuery and enumerate over the WildcardTerms for a query. This way I can set a maximum number of terms which I will evaluate. If too many terms match, I throw an exception.

I also have a maximum number of documents which are allowed to match, checked using the BitSet's cardinality. I am not sure whether that is necessary, but I thought that if only a few terms match but a few million documents do, that could also slow down a search considerably. I thought I could get the TermDocs for each WildcardTerm and use:
int count = termDocs.read(docs, freqs);

in order to have an optimized way to read no more than a maximum number of documents which match this term. I would then step through docs and set the bits for these documents.

Sometimes this works, but sometimes it returns a different number of search results. When I search for "content:test" in my index, I find 66 documents, but when I search for "t*st" with my WildcardFilter, I find only 23. There is only one term, "test", matching this query, and searching for this term in Luke also returns 66 documents. I found out that SegmentTermDocs sets a variable df to 23, which causes it to stop reading any further. Unfortunately I do not quite understand where this variable really comes from or what it is for.

I could probably just step through the TermDocs for each WildcardTerm. Is that a correct (and not dramatically slow) way to find all documents? But I would like to understand the difference in search results, what the TermDocs.read(docs, freqs) method does, and whether my kind of filter really makes sense.

I periodically rebuild my index, and I wonder why my WildcardFilter sometimes returns the correct search results and sometimes does not. What is the difference between stepping through the term docs with termDocs.next() and using the read method? Can anybody explain that?

Thanks in advance,
Thomas
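
For concreteness, the bulk-read approach described above might be sketched as follows. This is only a sketch under stated assumptions, not my actual filter: BulkDocs is a hypothetical stand-in for the read(docs, freqs) part of TermDocs, and collect, chunked and TooManyDocsException are made-up names for illustration. It relies on the documented contract that read returns the number of entries read and returns zero only when the stream is exhausted, so it calls read in a loop rather than once, on the assumption that a single call may hand back only part of the matching documents.

```java
import java.util.Arrays;
import java.util.BitSet;

final class BulkReadSketch {

    /** Hypothetical stand-in for the bulk-read method of TermDocs. */
    interface BulkDocs {
        /** Fills docs and freqs; returns the number of entries read, 0 only when exhausted. */
        int read(int[] docs, int[] freqs);
    }

    /** Hypothetical exception for the "too many matching documents" limit. */
    static final class TooManyDocsException extends RuntimeException {
        TooManyDocsException(String message) {
            super(message);
        }
    }

    /** Sets a bit for every document returned by termDocs, enforcing maxDocs. */
    static int collect(BulkDocs termDocs, BitSet bits, int maxDocs) {
        int[] docs = new int[32];
        int[] freqs = new int[32];
        int count;
        // Keep reading until read() reports exhaustion by returning 0.
        while ((count = termDocs.read(docs, freqs)) > 0) {
            for (int i = 0; i < count; i++) {
                bits.set(docs[i]);
            }
            if (bits.cardinality() > maxDocs) {
                throw new TooManyDocsException("more than " + maxDocs + " documents match");
            }
        }
        return bits.cardinality();
    }

    /**
     * Fake source that returns one chunk per call, to simulate a reader that
     * delivers fewer entries per call than the total number of matches.
     * Each chunk must be non-empty and no longer than the caller's buffers.
     */
    static BulkDocs chunked(final int[][] chunks) {
        return new BulkDocs() {
            private int next = 0;

            @Override
            public int read(int[] docs, int[] freqs) {
                if (next >= chunks.length) {
                    return 0;
                }
                int[] chunk = chunks[next++];
                System.arraycopy(chunk, 0, docs, 0, chunk.length);
                Arrays.fill(freqs, 0, chunk.length, 1);
                return chunk.length;
            }
        };
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        int matched = collect(chunked(new int[][] {{1, 2, 3}, {10, 11}}), bits, 100);
        System.out.println("matched documents: " + matched);
    }
}
```

With a single call to read instead of the loop, the fake source above would yield only the first chunk, which is the kind of shortfall I am seeing.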