Erik Hatcher wrote:

On Apr 4, 2006, at 11:23 AM, Fisheye wrote:
Probably. Do you know whether there is a way to get the similar words
that the algorithm generates when doing a fuzzy search?

Well, a roundabout way is to create a FuzzyQuery, rewrite it, cast the result to a BooleanQuery, and use the BooleanQuery API to extract the TermQuery objects; the Term inside each TermQuery holds what you're looking for. That's actually not a bad way to go, but you could also go lower-level and borrow the technique used under FuzzyQuery itself:

<http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/apache/lucene/search/FuzzyTermEnum.java>

We take an approach somewhere down the middle...

    IndexReader reader = ...;
    FuzzyQuery q = ...;

    // "enum" is a reserved word from Java 5 onwards, so use another name.
    FilteredTermEnum termEnum = q.getEnum(reader);

The advantage of this method is that it's easier to generalise (it works for any subclass of MultiTermQuery, not just FuzzyQuery) and it needs no rewriting, which may eat more memory, although I can't say for sure.

In fact our own code takes any query and looks at the type of it to extract terms from it, potentially recursively if it encounters a BooleanQuery. It would be Really Nice [TM] if Lucene had a method on the Query class to do this directly. :-)
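
The type-dispatch idea described above can be sketched roughly as follows. This is only a minimal, self-contained illustration: the Query, TermQuery, BooleanQuery and MultiTermQuery classes here are hypothetical stand-ins that mimic the shape of Lucene's types, not Lucene's actual API, and the expanded terms of the multi-term query are supplied by hand instead of being enumerated from an index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TermExtractionSketch {

    // Hypothetical stand-ins for the query types; NOT the real Lucene classes.
    interface Query {}

    // A query for exactly one term in one field.
    static final class TermQuery implements Query {
        final String field;
        final String text;
        TermQuery(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // A query made of nested clauses, each of which is itself a query.
    static final class BooleanQuery implements Query {
        final List<Query> clauses = new ArrayList<>();
        BooleanQuery add(Query clause) {
            clauses.add(clause);
            return this;
        }
    }

    // A query that expands to many terms (fuzzy, wildcard, ...).
    // Assumption: the expansion is passed in, not read from an index.
    static final class MultiTermQuery implements Query {
        final String field;
        final List<String> matchingTerms;
        MultiTermQuery(String field, List<String> matchingTerms) {
            this.field = field;
            this.matchingTerms = matchingTerms;
        }
    }

    // Returns every "field:text" pair reachable from the query,
    // in first-seen order and without duplicates.
    static Set<String> extractTerms(Query query) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query, terms);
        return terms;
    }

    // Dispatches on the runtime type and recurses into BooleanQuery clauses.
    private static void collect(Query query, Set<String> out) {
        if (query instanceof TermQuery) {
            TermQuery tq = (TermQuery) query;
            out.add(tq.field + ":" + tq.text);
        } else if (query instanceof MultiTermQuery) {
            MultiTermQuery mtq = (MultiTermQuery) query;
            for (String text : mtq.matchingTerms) {
                out.add(mtq.field + ":" + text);
            }
        } else if (query instanceof BooleanQuery) {
            for (Query clause : ((BooleanQuery) query).clauses) {
                collect(clause, out);
            }
        } else {
            throw new IllegalArgumentException(
                "Unsupported query type: " + query.getClass().getName());
        }
    }

    public static void main(String[] args) {
        Query nested = new BooleanQuery()
            .add(new MultiTermQuery("body", Arrays.asList("fuzzy", "fuzz")));
        Query query = new BooleanQuery()
            .add(new TermQuery("title", "lucene"))
            .add(nested);
        System.out.println(extractTerms(query));
    }
}
```

The explicit failure on unknown query types is a design choice for the sketch; real code might prefer to skip types it does not understand.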

Daniel


--
Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://www.nuix.com.au/                        Fax: +61 2 9212 6902
