FuzzyQuery performance depends on the number of unique terms in the index, not
the number of documents; e.g. a single "telephone directory" document could
contain millions of terms.
Each term considered is compared with the input term using an "edit distance"
algorithm, which is CPU-intensive.
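
To give a feel for the per-term cost, here is a minimal sketch of the classic Levenshtein edit distance in plain Java. It is an illustration only: the class and method names are made up for this example, and Lucene's own implementation may differ in detail. The point is that every comparison costs on the order of (input length x term length) steps, and that cost is paid once per candidate term.

```java
/** Illustrative sketch only; not Lucene's implementation. */
final class EditDistanceSketch {

    /** Returns the number of single-character inserts, deletes or substitutions turning a into b. */
    static int levenshtein(String a, String b) {
        // Two rolling rows of the dynamic-programming table.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a gives the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Swap rows so that prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Calling EditDistanceSketch.levenshtein(input, term) for each of millions of unique terms is where the time goes.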

The FuzzyQuery prefix length setting dictates whether the edit distance
comparisons are done against every term from A to Z (prefix length = 0) or only
against those terms sharing the first n characters of the input term. This can
make a huge difference in the number of terms compared: a prefix length of 1
would reduce the search space to roughly 1/26th of that for prefix length = 0,
assuming words are evenly distributed across the alphabet.
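
The effect of the prefix length can be sketched in plain Java as a filter applied before any edit distance work is done. This is a simplified model, not Lucene code: the class name, the method name and the in-memory term collection are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Illustrative model of how a prefix length narrows the fuzzy candidate set. */
final class PrefixLengthSketch {

    /**
     * Returns the terms that would still need an edit distance comparison:
     * those sharing the first prefixLength characters with the input term.
     * A prefixLength of 0 keeps every term.
     */
    static List<String> candidates(Collection<String> terms, String input, int prefixLength) {
        // Guard against a prefix length longer than the input itself.
        int n = Math.min(prefixLength, input.length());
        String requiredPrefix = input.substring(0, n);

        List<String> result = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(requiredPrefix)) {
                result.add(term);
            }
        }
        return result;
    }
}
```

With the terms apple, apply, angle, banana, bandit and cherry and the input "appel", a prefix length of 0 leaves all six terms to compare, a prefix length of 1 leaves three, and a prefix length of 2 leaves two.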

Your prefix query does a simpler operation, the equivalent of
String.startsWith(..), and will typically operate on fewer terms.
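
As a rough illustration of why a prefix match touches fewer terms, the sketch below assumes the unique terms are held in sorted order (modelled here with a java.util.TreeSet, which is a stand-in and not Lucene's term dictionary). It jumps to the first term at or after the prefix and stops at the first term that no longer starts with it. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Illustrative prefix matching over a sorted set of terms. */
final class PrefixMatchSketch {

    /** Returns all terms starting with prefix, visiting only the matching range. */
    static List<String> matches(NavigableSet<String> sortedTerms, String prefix) {
        List<String> result = new ArrayList<>();
        // In sorted order, terms sharing a prefix are contiguous and begin
        // at the first term greater than or equal to the prefix itself.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // past the end of the matching range
            }
            result.add(term);
        }
        return result;
    }
}
```

Each term visited costs one cheap startsWith check, with no edit distance calculation at all.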

Cheers
Mark



----- Original Message ----
From: Erick Erickson <erickerick...@gmail.com>
To: java-user@lucene.apache.org
Sent: Monday, 15 June, 2009 15:34:18
Subject: Re: Fuzzy vs Prefix query Performance

Well, if you're seeing it, it's possible <G>....

But the first question is always "what were you measuring?" Be aware
that when you open a searcher, the first few queries can fill caches, etc.,
and may take an anomalously long time, especially if you're sorting. So could
you give more details of your test setup?

Best
Erick

On Mon, Jun 15, 2009 at 3:19 PM, Zsolt Koppany <zkoppanyl...@intland.com> wrote:

> Hi,
>
> on 99470 documents (I mean Lucene documents) a FuzzyQuery needs approx 30
> seconds but PrefixQuery less than one.
>
> All Lucene files need 65MB together.
>
> I'm a bit surprised by that. Is that possible?
>
> Zsolt
>
> Zsolt Koppany
> Phone: +49-711-67400-679
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




