[
https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013932#comment-13013932
]
Dawid Weiss commented on SOLR-2378:
-----------------------------------
I didn't have time to take care of this until now, apologies. So, looking at
Lookup#lookup(), I just wanted to clarify:
{code}
/**
 * Look up a key and return possible completion for this key.
 * @param key lookup key. Depending on the implementation this may be
 * a prefix, misspelling, or even infix.
 * @param onlyMorePopular return only more popular results
 * @param num maximum number of results to return
 * @return a list of possible completions, with their relative weight (e.g. popularity)
 */
public abstract List<LookupResult> lookup(String key, boolean onlyMorePopular, int num);
{code}
The "onlyMorePopular" flag means more popular than... what? I see that
TSTLookup and JaspellLookup (Andrzej, will you confirm, please?) sort matches
in a priority queue by their associated value (frequency, I guess). This makes
sense, but onlyMorePopular is misleading -- it should be called onlyMostPopular
(those with native knowledge of English subtleties, speak up if I'm right here).
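To make the question concrete, here is a minimal sketch of what I understand
"sorted in a priority queue by associated value" to mean: keep only the num
heaviest matches for a prefix. It is not taken from TSTLookup or JaspellLookup;
the class TopByWeightSketch, the Completion type (a stand-in for LookupResult)
and the plain Map input are all hypothetical and used for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopByWeightSketch {

    /** Hypothetical stand-in for LookupResult: a completion and its weight. */
    public static final class Completion {
        public final String key;
        public final float weight;

        public Completion(String key, float weight) {
            this.key = key;
            this.weight = weight;
        }
    }

    /**
     * Returns at most num entries whose key starts with prefix,
     * ordered from the highest weight to the lowest.
     */
    public static List<Completion> topByWeight(Map<String, Float> entries, String prefix, int num) {
        List<Completion> result = new ArrayList<Completion>();
        if (num <= 0) {
            return result;
        }

        // Min-heap on weight: the head is always the weakest candidate kept so far.
        Comparator<Completion> byWeight = new Comparator<Completion>() {
            @Override
            public int compare(Completion a, Completion b) {
                return Float.compare(a.weight, b.weight);
            }
        };
        PriorityQueue<Completion> queue = new PriorityQueue<Completion>(num, byWeight);

        for (Map.Entry<String, Float> e : entries.entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                continue;
            }
            queue.add(new Completion(e.getKey(), e.getValue()));
            if (queue.size() > num) {
                queue.poll(); // drop the least popular candidate
            }
        }

        // Draining the heap yields ascending weights, so reverse at the end.
        while (!queue.isEmpty()) {
            result.add(queue.poll());
        }
        Collections.reverse(result);
        return result;
    }
}
```

With this reading the flag selects the most popular matches overall, not matches
"more popular" than some reference value, which is why the current name looks
misleading to me.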
I also wanted to confirm: the Dictionary can come from various sources, so we
can't rely on the presence of the built-in Lucene automaton, can we? Even if I
wanted to reuse it, there'd be no easy way to determine if it's a full
automaton or a partial one (because of the gaps/trimming)... I think I'll just
implement the solution by building the automaton from whatever Dictionary comes
in and serializing/deserializing it, similar to what TSTLookup does.
Sounds ok?
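Roughly, the build/store/load cycle I have in mind looks like the sketch below.
This is an illustration under loud assumptions: a sorted TreeMap stands in for
the automaton, a plain Iterator of key/weight pairs stands in for the
Dictionary, and the on-disk layout (entry count, then UTF key and float weight
per entry) is made up for the example. None of the names come from Lucene or
Solr, and this is not TSTLookup's actual format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class BuildAndStoreSketch {

    /** Builds the in-memory structure from whatever source is handed in. */
    public static TreeMap<String, Float> build(Iterator<Map.Entry<String, Float>> source) {
        TreeMap<String, Float> data = new TreeMap<String, Float>();
        while (source.hasNext()) {
            Map.Entry<String, Float> e = source.next();
            data.put(e.getKey(), e.getValue());
        }
        return data;
    }

    /** Writes the entry count followed by (key, weight) pairs. */
    public static void store(TreeMap<String, Float> data, OutputStream out) throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(data.size());
        for (Map.Entry<String, Float> e : data.entrySet()) {
            dos.writeUTF(e.getKey());
            dos.writeFloat(e.getValue());
        }
        dos.flush();
    }

    /** Reads back what store() wrote, so no rebuild from the source is needed. */
    public static TreeMap<String, Float> load(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int size = dis.readInt();
        TreeMap<String, Float> data = new TreeMap<String, Float>();
        for (int i = 0; i < size; i++) {
            String key = dis.readUTF();
            float weight = dis.readFloat();
            data.put(key, weight);
        }
        return data;
    }

    /** Convenience helper for trying the cycle in memory. */
    public static TreeMap<String, Float> roundTrip(TreeMap<String, Float> data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            store(data, bytes);
            return load(new ByteArrayInputStream(bytes.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of always building from the incoming source is that the lookup never
has to guess whether a pre-existing structure is complete.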
> FST-based Lookup (suggestions) for prefix matches.
> --------------------------------------------------
>
> Key: SOLR-2378
> URL: https://issues.apache.org/jira/browse/SOLR-2378
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Labels: lookup, prefix
> Fix For: 4.0
>
>
> Implement a subclass of Lookup based on finite state automata/transducers
> (Lucene FST package). This issue is for implementing a relatively basic
> prefix matcher; we will handle infixes and other types of input matches
> gradually.