Here's the paper I was thinking of (Robert found this):
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.159.9682 ...
E.g., note this sentence from the abstract:
We show that the first implementation, based on a postprocessing
approach, allows an arbitrary user to obtain information about
Some further speedup may be possible when the same combination of
filters (user account and date range here) is reused for another query.
The combined filter can then be made as an OpenBitSetDISI
(in the util package) and kept around for reuse.
Regards,
Paul Elschot
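The reuse idea above can be sketched roughly as follows. This is only an illustration under assumptions: it uses java.util.BitSet as a stand-in for OpenBitSetDISI, and the class name CombinedFilterCache, the string cache key, and the builds counter are hypothetical names, not Lucene API. In a real index the cached bits would presumably also have to be tied to the reader they were computed against.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: caches the AND of two per-document bitsets
// (e.g. "account matches" and "date in range") under a caller-chosen key.
final class CombinedFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();

    // Counts how often a combination was actually computed (for illustration).
    int builds = 0;

    BitSet get(String key, BitSet accountDocs, BitSet dateRangeDocs) {
        BitSet combined = cache.get(key);
        if (combined == null) {
            // Copy first so the input bitsets are left untouched.
            combined = (BitSet) accountDocs.clone();
            combined.and(dateRangeDocs);
            cache.put(key, combined);
            builds++;
        }
        return combined;
    }
}
```

A second query with the same account and date range would then get the stored bitset back without recomputing the intersection.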
On Sunday, 24 October 2010 12:34:
October 24, 2010 12:34 PM
To: dev@lucene.apache.org
Subject: Re: Using filters to speed up queries
Here is what I've found so far:
I have three main sets to use in a query:
Account MUST be xxx
User query
DateRange on the query MUST be in (a,b); it is a NumericField
I tried the following combinations (all using a BooleanQuery with the user
query added to it)
1. One:
- Add ACCOUNT as a TermQuery
On Sunday, 24 October 2010 00:18:48, Khash Sajadi wrote:
> My index contains documents for different users. Each document has the user
> id as a field on it.
>
> There are about 500 different users with 3 million documents.
>
> Currently I'm calling Search with the query (parsed from user)
> and
Unfortunately, Lucene's performance with filters isn't great.
This is because we now always apply filters "up high", using a
leapfrog approach, where we alternate asking the filter and then the
scorer to skip to each other's docID.
But if the filter accepts "enough" (~1% in my testing) of the
doc
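The leapfrog idea described above can be sketched in plain Java. This is a simplified model, not Lucene's actual code: the filter and the scorer are represented as sorted int arrays of docIDs, and the names Leapfrog, advance, and intersect are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class Leapfrog {
    // Returns the first position at or after 'from' whose docID is >= target,
    // or docs.length if there is none. A real posting list would use skip data
    // here instead of a linear scan.
    static int advance(int[] docs, int from, int target) {
        int i = from;
        while (i < docs.length && docs[i] < target) {
            i++;
        }
        return i;
    }

    // Alternates between the two sorted docID lists, each time asking the one
    // that is behind to skip forward to the other's current docID.
    static List<Integer> intersect(int[] filterDocs, int[] scorerDocs) {
        List<Integer> hits = new ArrayList<>();
        int f = 0;
        int s = 0;
        while (f < filterDocs.length && s < scorerDocs.length) {
            int fd = filterDocs[f];
            int sd = scorerDocs[s];
            if (fd == sd) {
                hits.add(fd); // both sides agree on this document
                f++;
                s++;
            } else if (fd < sd) {
                f = advance(filterDocs, f, sd);
            } else {
                s = advance(scorerDocs, s, fd);
            }
        }
        return hits;
    }
}
```

When one side is empty or runs out early, the loop ends at once, so the other side is never walked to the end.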
Sent: Sunday, October 24, 2010 12:52 AM
To: dev@lucene.apache.org
Subject: Re: Using filters to speed up queries
On the topic of BooleanQuery. Would the order of the queries being added
matter? Is it clever enough to skip the second query when the first one is
returning nothing and is a MUST?
On 23 October 2010 23:47, Khash Sajadi wrote:
Thanks. Will try it. Been thinking about separate indexes but have one
worry: memory and file handle issues.
I'm worried that in some scenarios I might end up with thousands of
IndexReaders/IndexWriters open in the process (it is Windows). How is that
going to play out with memory?
On 23 October 2010
Look at BooleanQuery with 2 "must" clauses - one for the query, one for a
ConstantScoreQuery wrapping the filter.
BooleanQuery should then automatically use skips when reading matching docs
from the main query and skip to the next docs identified by the filter.
Give it a try, otherwise you ma