That's certainly valid. You could also consider n-grams here as another approach.
Its also useful to restrict the number of leading (or trailing) characters you allow. For instance, requiring at least 3 non-wildcard leading characters makes a big difference. It's also a legitimate question how well users are served by, say, a*. The results are often too broad to be useful... Best Erick On Thu, Jan 20, 2011 at 8:48 AM, comparis.ch - Roman Baeriswyl < roman.baeris...@comparis.ch> wrote: > Thanks for the answer. That does make sense. > It first gets thru all (not only those which could pass the filter) terms > available and investigates all terms which match any of the wildcard > queries. And that could take quite some time if I got leading wildcard > queries. > > Guess I'll try another approach then with "reverse" indexing the fields > which can be searched and then transform the leading wildcard queries to > following wildcards by reversing them too. > > -----Original Message----- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Donnerstag, 20. Januar 2011 14:29 > To: java-user@lucene.apache.org > Subject: RE: Filter Performance > > The reason for this is that the filters and other boolean clauses are > applied during result collection. But wildcard query first needs to > investigate all terms that match and this is done before the results are > collected. And this step takes the time (especially before Lucene 4.0). > > There is no way to change this. > > ----- > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -----Original Message----- > > From: comparis.ch - Roman Baeriswyl [mailto:roman.baeris...@comparis.ch] > > Sent: Thursday, January 20, 2011 10:50 AM > > To: 'java-user@lucene.apache.org' > > Subject: Filter Performance > > > > Hi all > > > > I've got an Index with a few 100k documents and I want to run a rather > > complex wildcard (incl. leading wildcards) query on it. > > The wildcard query takes about 2 seconds to complete. > > Now, I want to limit the items on which the wildcard query will be > executed. > > Let's say, I want to limit the items to those which have the field > > "ProductName" set to "milk" (this query itself runs in less than 5 > milliseconds > > and returns about 100 items). > > So, I tried different things but everything resulted in having the exact > same > > execution time like without this filter. Even if I run the query > multiple > times > > in a row with the same Query and Filter items. > > > > Here's what I tried: > > > > - Adding "+ProductName:milk" to the Query > > > > - Added FieldCacheTermsFilter("ProductName ", new String[] { > "milk" }) > > as Filter > > > > - Wrapped the Filter in CachingWrapperFilter > > > > - Used BooleanQuery filterQuery = new BooleanQuery(); > > filterQuery.add(new TermQuery(new Term("ProductName", "milk ")), > > BooleanClause.Occur.MUST); and wrapped it with a QueryWrapperFilter and > > also tried wrapping it in a FilteredQuery > > > > Nothing improved the searchspeed, it had always the same speed as when > > he parsed thru all documents without any pre-filtering. > > > > Is this pre-filtering different than I thought? Am I doing something > wrong? > > Does the index need to be build somehow special for this to work? > > > > Thanks for your help > > //Roman > > > > ________________________________ > > Holen Sie die besten Elektronik-Aktionen direkt auf Ihr Facebook-Profil: > > http://www.facebook.com/pages/Preissturz/218831069608 > > > > Die besten Elektronik-Aktionen auf Twitter: > http://twitter.com/preissturz1 > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > Holen Sie die besten Elektronik-Aktionen direkt auf Ihr Facebook-Profil: > http://www.facebook.com/pages/Preissturz/218831069608 > > Die besten Elektronik-Aktionen auf Twitter: http://twitter.com/preissturz1 > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >