RE: RangeFilter performance problem using MultiReader

Uwe Schindler Sat, 11 Apr 2009 01:32:48 -0700

Hi Raf,

it would be nice to hear how it works for you!


Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Raf [mailto:r.ventag...@gmail.com]
> Sent: Saturday, April 11, 2009 9:22 AM
> To: java-user@lucene.apache.org
> Subject: Re: RangeFilter performance problem using MultiReader
> 
> Thanks Uwe,
> I had already read about TrieRangeFilter on this mailing list and I
> thought
> it could be useful to solve my problem.
> I think I will trie it for test purposes.
> 
> Unfortunately, I have now to solve the problem in a production system and
> I
> would like to avoid using a yet unreleased version. So I think that my
> best
> choice, at the moment, is to consolidate my indexes and waiting until this
> interesting new feature will be available in the official release.
> 
> Thanks a lot to all of you,
> Raf
> 
> 
> On Fri, Apr 10, 2009 at 10:13 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> 
> > You got a lot of answers and questions about your index structure. Now
> > another idea, maybe this helps you to speed up your RangeFilter:
> >
> > What type of range do you want to query? From your index statistics, it
> > looks like a numeric/date field from which you filter very large ranges.
> If
> > the values are very fine-grained and so you hit a lot of terms for the
> > range, you might consider using TrieRangeFilter, which is a new contrib
> > module in the yet unreleased Lucene 2.9:
> >
> > http://hudson.zones.apache.org/hudson/job/Lucene-
> trunk/javadoc/all/org/apach
> > e/lucene/search/trie/package-
> summary.html<http://hudson.zones.apache.org/hudson/job/Lucene-
> trunk/javadoc/all/org/apach%0Ae/lucene/search/trie/package-summary.html>
> >
> > The name and API may change before release (if it moves to core), but
> you
> > can try it out, it is working stable and currently runs in productive
> > websites! It works for int, long, double, float and Date values (if
> encoded
> > using Date.getTime() as long).
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Raf [mailto:r.ventag...@gmail.com]
> > > Sent: Friday, April 10, 2009 4:38 PM
> > > To: java-user@lucene.apache.org
> > > Subject: RangeFilter performance problem using MultiReader
> > >
> > > Hi,
> > > we are experiencing some problems using RangeFilters and we think
> there
> > > are
> > > some performance issues caused by MultiReader.
> > >
> > > We have more or less 3M documents in 24 indexes and we read all of
> them
> > > using a MultiReader.
> > > If we do a search using only terms, there are no problems, but it if
> we
> > > add
> > > to the same search terms a RangeFilter that extracts a large subset of
> > the
> > > documents (e.g. 500K), it takes a lot of time to execute (about 15s).
> > >
> > > In order to identify the problem, we have tried to consolidate the
> index:
> > > so
> > > now we have the same 3M docs in a single 10GB index.
> > > If we repeat the same search using this index, it takes only a small
> > > fraction of the previous time (about 2s).
> > >
> > > Is there something we can do to improve search performance using
> > > RangeFilters with MultiReader or the only solution is to have only a
> > > single
> > > big index?
> > >
> > > Thanks,
> > > Raf
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

RE: RangeFilter performance problem using MultiReader

Reply via email to