Hi Jong, Jong Kim wrote: > I'm looking for a stemmer that is capable of returning all morphological > variants of a query term (to be used for high-recall search). For example, > given a query term of 'cares', I would like to be able to generate 'cares', > 'care', 'cared', and 'caring'.
To achieve high recall, can't you just stem terms both when indexing and when querying? The only reasons I can think of not to do so are to support both high-precision and high-recall queries with the same index, or to give greater weight to exact-match documents than to stemmed-match documents within a single query. If either of these is the case, you could maintain two fields (one stemmed and one non-stemmed) or two indexes, and choose which field/index to use (reason #1) or combine the two in a single query (reason #2). Actually, now that I think about it more, two indexes for reason #2 (i.e., stemmed match as fallback if exact match fails) would be tricky, due to issues with fusion of search results -- better to use two fields in a single index for this case. Steve --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]