That's certainly valid. You could also consider n-grams here as another
approach.
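
To make the n-gram idea a bit more concrete, here is a minimal sketch in
plain Java. It is not Lucene's own n-gram analysis code, and the class and
method names are made up for illustration. The idea is to index the character
n-grams of each term, so that an infix search such as *ilk* can become an
exact lookup on the gram "ilk" instead of a wildcard scan.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not a Lucene class).
final class NGramSketch {
    /** Returns all contiguous character n-grams of the term, in order. */
    static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<>();
        // Slide a window of width n across the term.
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }
}
```

With n = 3, "milk" yields the grams "mil" and "ilk". Terms shorter than n
yield no grams at all, so very short search strings would need separate
handling.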

It's also useful to require a minimum number of literal leading (or
trailing) characters. For instance, requiring at least 3 non-wildcard
leading characters makes a big difference.
It's also a legitimate question how well users are served by, say, a*. The
results are often too broad to be useful...
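
A rough sketch of such a check, run on the raw pattern before it is turned
into a query. The names here are hypothetical, and it assumes '*' and '?' are
the only wildcard characters and ignores escaping.

```java
// Hypothetical helper, for illustration only (not a Lucene class).
final class WildcardGuard {
    /** Counts the literal characters before the first '*' or '?'. */
    static int leadingLiteralCount(String pattern) {
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            if (c == '*' || c == '?') {
                break;
            }
            i++;
        }
        return i;
    }

    /** True if the pattern starts with at least minLiterals literal characters. */
    static boolean isAcceptable(String pattern, int minLiterals) {
        return leadingLiteralCount(pattern) >= minLiterals;
    }
}
```

With a minimum of 3, "mil*" passes while "a*" and "*ilk" are rejected. What
to do with a rejected pattern (error message, fallback query) is up to the
application.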

Best
Erick

On Thu, Jan 20, 2011 at 8:48 AM, comparis.ch - Roman Baeriswyl <
roman.baeris...@comparis.ch> wrote:

> Thanks for the answer. That does make sense.
> It first goes through all available terms (not only those that could pass
> the filter) and investigates every term that matches any of the wildcard
> queries. That can take quite some time when the queries have leading
> wildcards.
>
> I guess I'll try another approach then: index the searchable fields in
> reversed form as well, and turn the leading-wildcard queries into
> trailing-wildcard queries by reversing them too.
>
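
For reference, the reversal approach could be sketched like this in plain
Java. The "_rev" field suffix and all names are assumptions made for
illustration, not an existing Lucene API, and escaped wildcard characters are
not handled. The same reverse() would be applied to every term written into
the reversed field at index time.

```java
// Hypothetical helper, for illustration only (not a Lucene class).
final class ReverseWildcardSketch {
    /** Reverses a string; also used on each term stored in the reversed field. */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** True if the pattern starts with '*' or '?'. */
    static boolean hasLeadingWildcard(String pattern) {
        return !pattern.isEmpty()
                && (pattern.charAt(0) == '*' || pattern.charAt(0) == '?');
    }

    /**
     * Returns {field, pattern}. A leading-wildcard pattern is reversed and
     * sent to the reversed field, but only if that removes the leading
     * wildcard; patterns like *il* gain nothing and are left alone.
     */
    static String[] rewrite(String field, String pattern) {
        String reversed = reverse(pattern);
        if (hasLeadingWildcard(pattern) && !hasLeadingWildcard(reversed)) {
            return new String[] { field + "_rev", reversed };
        }
        return new String[] { field, pattern };
    }
}
```

So *ilk on ProductName becomes kli* on ProductName_rev, which matches the
reversed term "klim" (milk). Patterns with wildcards at both ends still need
something else, such as n-grams.
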
> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Thursday, January 20, 2011 14:29
> To: java-user@lucene.apache.org
> Subject: RE: Filter Performance
>
> The reason for this is that filters and other boolean clauses are
> applied during result collection. A wildcard query, however, first needs
> to investigate all terms that match, and this is done before the results
> are collected. That step is what takes the time (especially before
> Lucene 4.0).
>
> There is no way to change this.
>
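
A toy model of why the filter does not help, using a sorted set as a
stand-in for the term dictionary. This is only a sketch under that
assumption, not how Lucene is implemented, and the names are invented. A
trailing wildcard can seek to its prefix and stop early; a leading wildcard
has to look at every term, whatever filter is applied later.

```java
import java.util.NavigableSet;

// Hypothetical model of a sorted term dictionary, for illustration only.
final class TermScanSketch {
    /** Terms examined for prefix*: seek to the prefix, stop at the first miss. */
    static int termsVisitedForPrefix(NavigableSet<String> terms, String prefix) {
        int visited = 0;
        for (String term : terms.tailSet(prefix, true)) {
            visited++;
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can share the prefix
            }
        }
        return visited;
    }

    /** Terms examined for *suffix: sort order is no help, so every term is checked. */
    static int termsVisitedForSuffix(NavigableSet<String> terms, String suffix) {
        int visited = 0;
        for (String term : terms) {
            visited++;
            term.endsWith(suffix); // the per-term match test; result unused here
        }
        return visited;
    }
}
```

Neither method takes a document filter as input, which is the point: the
term expansion cost depends on the size of the term dictionary, not on how
many documents survive the filter.
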
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-----
> > From: comparis.ch - Roman Baeriswyl [mailto:roman.baeris...@comparis.ch]
> > Sent: Thursday, January 20, 2011 10:50 AM
> > To: 'java-user@lucene.apache.org'
> > Subject: Filter Performance
> >
> > Hi all
> >
> > I've got an Index with a few 100k documents and I want to run a rather
> > complex wildcard (incl. leading wildcards) query on it.
> > The wildcard query takes about 2 seconds to complete.
> > Now, I want to limit the items on which the wildcard query will be
> > executed.
> > Let's say, I want to limit the items to those which have the field
> > "ProductName" set to "milk" (this query itself runs in less than 5
> > milliseconds and returns about 100 items).
> > So, I tried different things, but everything resulted in exactly the
> > same execution time as without this filter, even if I run the query
> > multiple times in a row with the same Query and Filter items.
> >
> > Here's what I tried:
> >
> > - Adding "+ProductName:milk" to the Query
> >
> > - Adding FieldCacheTermsFilter("ProductName", new String[] { "milk" })
> >   as Filter
> >
> > - Wrapping the Filter in CachingWrapperFilter
> >
> > - Using BooleanQuery filterQuery = new BooleanQuery();
> >   filterQuery.add(new TermQuery(new Term("ProductName", "milk")),
> >   BooleanClause.Occur.MUST); wrapping it in a QueryWrapperFilter, and
> >   also trying to wrap it in a FilteredQuery
> >
> > Nothing improved the search speed; it was always as slow as when it
> > parsed through all documents without any pre-filtering.
> >
> > Is this pre-filtering different from what I thought? Am I doing
> > something wrong?
> > Does the index need to be built in some special way for this to work?
> >
> > Thanks for your help
> > //Roman
> >
> > ________________________________
> > Get the best electronics deals directly on your Facebook profile:
> > http://www.facebook.com/pages/Preissturz/218831069608
> >
> > The best electronics deals on Twitter: http://twitter.com/preissturz1
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>