Erik Hatcher wrote:
On Apr 4, 2006, at 11:23 AM, Fisheye wrote:
Probably. Do you know whether there is a way to get the similar words
that the algorithm generates when doing a fuzzy search?
Well, a roundabout way is to create a FuzzyQuery, rewrite it, cast the
result to a BooleanQuery, and use the BooleanQuery API to extract the
TermQuery objects; the Term inside each TermQuery is what you're
looking for. That's actually not a bad way to go, but you could also go
more low-level and borrow the technique used under FuzzyQuery itself:
<http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/apache/lucene/search/FuzzyTermEnum.java>
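For anyone who wants to see the idea without Lucene on the classpath, here is a rough standalone sketch of what such a term enumeration does in principle: walk every candidate term and keep the ones whose edit-distance similarity to the target clears a threshold. The names (FuzzyTermSketch, similarTerms) are made up for illustration, and normalising by the shorter string's length is an assumption rather than a copy of FuzzyTermEnum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not Lucene code, and not tied to any Lucene API.
final class FuzzyTermSketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // 1.0 means identical; the value can go negative for very different strings.
    // Dividing by the shorter length is an assumption made for this sketch.
    static double similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1.0 : 0.0;
        }
        return 1.0 - (double) editDistance(a, b) / shorter;
    }

    // Stand-in for enumerating index terms and filtering them by similarity.
    static List<String> similarTerms(Iterable<String> indexTerms,
                                     String target, double minSimilarity) {
        List<String> matches = new ArrayList<>();
        for (String term : indexTerms) {
            if (similarity(term, target) >= minSimilarity) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "lucine", "solr");
        // prints [lucene, lucent, lucine]
        System.out.println(similarTerms(terms, "lucene", 0.5));
    }
}
```

A real index would supply the candidate terms from its term dictionary instead of an in-memory list, so the filtering step is the only part that carries over.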
We take an approach somewhere down the middle...
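IndexReader reader = ...;
FuzzyQuery q = ...;
// "enum" is a reserved word from Java 5 onwards, so use another name.
// Note: getEnum() may be protected, depending on the Lucene version.
FilteredTermEnum termEnum = q.getEnum(reader);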
The advantage of this method is that it's easier to generalise (works
for any subclass of MultiTermQuery, not just FuzzyQuery), while not
needing any rewriting (which may eat more memory, although I can't say
for sure.)
In fact our own code takes any query and looks at the type of it to
extract terms from it, potentially recursively if it encounters a
BooleanQuery. It would be Really Nice [TM] if Lucene had a method on
the Query class to do this directly. :-)
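The type-dispatching, recursive extraction described above might look roughly like the sketch below. To keep it runnable on its own, it uses hypothetical stand-in types (TermQ, MultiTermQ, BoolQ) in place of Lucene's TermQuery, MultiTermQuery and BooleanQuery; none of these names come from Lucene, and the multi-term stand-in simply carries a precomputed list where real code would walk a term enumeration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: the nested types are stand-ins, NOT Lucene classes.
final class TermExtractorSketch {

    interface Q {}

    // Stand-in for a query on a single term.
    static final class TermQ implements Q {
        final String term;
        TermQ(String term) { this.term = term; }
    }

    // Stand-in for a multi-term query (fuzzy, wildcard, prefix, ...).
    // The list represents whatever a term enumeration would yield.
    static final class MultiTermQ implements Q {
        final List<String> matchingTerms;
        MultiTermQ(String... terms) {
            this.matchingTerms = new ArrayList<>(Arrays.asList(terms));
        }
    }

    // Stand-in for a boolean query holding nested clauses.
    static final class BoolQ implements Q {
        final List<Q> clauses;
        BoolQ(Q... clauses) {
            this.clauses = new ArrayList<>(Arrays.asList(clauses));
        }
    }

    // Collects every term in the query tree, keeping first-seen order.
    static Set<String> extractTerms(Q query) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query, terms);
        return terms;
    }

    private static void collect(Q query, Set<String> out) {
        if (query instanceof TermQ) {
            out.add(((TermQ) query).term);
        } else if (query instanceof MultiTermQ) {
            out.addAll(((MultiTermQ) query).matchingTerms);
        } else if (query instanceof BoolQ) {
            // Recurse into each clause, which may itself be a boolean query.
            for (Q clause : ((BoolQ) query).clauses) {
                collect(clause, out);
            }
        } else {
            throw new IllegalArgumentException(
                "Unsupported query type: " + query.getClass().getName());
        }
    }

    public static void main(String[] args) {
        Q query = new BoolQ(
            new TermQ("apache"),
            new MultiTermQ("lucene", "lucent"),
            new BoolQ(new TermQ("apache"), new TermQ("index")));
        // prints [apache, lucene, lucent, index]
        System.out.println(extractTerms(query));
    }
}
```

Throwing on an unknown query type is a design choice for the sketch: it makes unsupported cases visible instead of silently returning an incomplete term set.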
Daniel
--
Daniel Noll
Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia Ph: +61 2 9280 0699
Web: http://www.nuix.com.au/ Fax: +61 2 9212 6902