Maybe this helps you, but read the docs first; it works only with single-valued fields:
http://lucene.apache.org/java/2_9_1/api/core/org/apache/lucene/search/FieldCacheTermsFilter.html
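
To show why the single-valued restriction matters, here is a rough sketch of the idea behind a field-cache-based terms filter, written in plain Java rather than against the Lucene API. It is not the actual implementation of FieldCacheTermsFilter. The class name, the buildFilter method and the fieldCache array are all hypothetical stand-ins. The array holds exactly one value per document, which is why the approach cannot handle multi-valued fields. The filter is built in one pass over the documents with a hash lookup per document, so the work is driven by the number of documents and not by one lookup in the index per term.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
class FieldCacheFilterSketch {

    /**
     * Returns a bit set with one bit per document. A bit is set when the
     * document's single field value is contained in the allowed terms.
     *
     * fieldCache is an assumed stand-in for a field cache:
     * fieldCache[docId] is the one value of the field for that document,
     * or null if the document has no value.
     */
    static BitSet buildFilter(String[] fieldCache, Set<String> allowed) {
        BitSet bits = new BitSet(fieldCache.length);
        for (int docId = 0; docId < fieldCache.length; docId++) {
            String value = fieldCache[docId];
            // One hash lookup per document, however many terms are allowed.
            if (value != null && allowed.contains(value)) {
                bits.set(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Six documents; document 3 has no value for the field.
        String[] fieldCache = {"17", "42", "99", null, "42", "1003"};
        // Terms to filter on; "555" matches no document.
        Set<String> allowed = new HashSet<>(Arrays.asList("42", "1003", "555"));

        BitSet bits = buildFilter(fieldCache, allowed);
        System.out.println(bits); // prints {1, 4, 5}
    }
}
```

With 1M docs the loop runs 1M times whether the set holds 10K or 50K terms, which is the trade-off the docs describe. Whether that beats the OR'ed boolean query in your case is something you would have to measure.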

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Eran Sevi [mailto:erans...@gmail.com]
> Sent: Sunday, November 22, 2009 3:49 PM
> To: java-user@lucene.apache.org
> Subject: Efficient filtering advise
>
> Hi,
>
> I need to filter my queries using a rather large subset of terms (can be
> 10K or even 50K).
> All these terms are sure to exist in the index, so the number of results
> can be about the same as the number of terms in the filter.
> The terms are numbers, but they are not consecutive and come from a large
> set of possible values (so range queries are probably not good for me).
> The index itself is about 1M docs, and running even a simple query with
> such a large filter takes a lot of time, even if the number of results is
> only a few hundred docs.
> It seems like the speed is affected by the length of the filter even if
> the number of results remains more or less the same. That is logical, but
> not with such a large loss of performance as I'm experiencing (running
> the query with a 10K terms filter takes an average of 1s 187ms with 600
> results, while running it with a 50K terms filter takes an average of
> 5s 207ms with 1000 results).
>
> Currently I'm using a QueryFilter with a boolean query in which I "OR" the
> different terms together.
> I also can't use a cached filter efficiently, since the terms to filter on
> change with almost every query.
>
> I was wondering if there's a better way to filter my queries so they won't
> take a few seconds to run?
>
> Thanks in advance for any advice,
> Eran.