[jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

Dawid Weiss (JIRA) Thu, 31 Mar 2011 04:33:50 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013932#comment-13013932
 ]


Dawid Weiss commented on SOLR-2378:
-----------------------------------

I didn't have time to take care of this until now, apologies. So, looking at 
Lookup#lookup(), I just wanted to clarify:

{code}
  /**
   * Look up a key and return possible completion for this key.
   * @param key lookup key. Depending on the implementation this may be
   * a prefix, misspelling, or even infix.
   * @param onlyMorePopular return only more popular results
   * @param num maximum number of results to return
   * @return a list of possible completions, with their relative weight (e.g. popularity)
   */
  public abstract List<LookupResult> lookup(String key, boolean onlyMorePopular, int num);
{code}

the "onlyMorePopular" flag means more popular than... what? I see that 
TSTLookup and JaspellLookup (Andrzej, will you confirm, please?) sort matches 
in a priority queue by their associated value (frequency, I guess). This makes 
sense, but onlyMorePopular is misleading -- it should be called 
onlyMostPopular (those with native knowledge of English subtleties, speak up 
if I'm right here).
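
To make the "most popular" reading concrete, here is a minimal sketch of 
keeping the top num matches by weight with a bounded priority queue. It is an 
illustration only, not the TSTLookup or JaspellLookup code: WeightedKey and 
TopNByWeight are hypothetical names, and the linear scan over a list stands in 
for whatever traversal a real implementation uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for illustration only; this is not Solr's LookupResult.
final class WeightedKey implements Comparable<WeightedKey> {
  final String key;
  final float weight;

  WeightedKey(String key, float weight) {
    this.key = key;
    this.weight = weight;
  }

  @Override
  public int compareTo(WeightedKey other) {
    // Natural order is ascending by weight.
    return Float.compare(weight, other.weight);
  }
}

// Hypothetical helper: assumes "onlyMorePopular" really means "the num most popular".
final class TopNByWeight {
  /** Returns at most num entries whose key starts with prefix, heaviest first. */
  static List<WeightedKey> lookup(List<WeightedKey> entries, String prefix, int num) {
    List<WeightedKey> result = new ArrayList<>();
    if (num <= 0) {
      return result;
    }
    // Min-heap: the lightest retained entry sits at the head and is evicted first.
    PriorityQueue<WeightedKey> queue = new PriorityQueue<>(num);
    for (WeightedKey e : entries) {
      if (!e.key.startsWith(prefix)) {
        continue;
      }
      queue.offer(e);
      if (queue.size() > num) {
        queue.poll();
      }
    }
    result.addAll(queue);
    // The heap's iteration order is unspecified, so sort explicitly, descending.
    Collections.sort(result, Collections.reverseOrder());
    return result;
  }
}
```

Under this reading the flag selects the best num matches overall; it does not 
compare candidates against the popularity of the key itself. The order of 
entries with equal weight is not defined in this sketch.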

I also see, and wanted to confirm -- the Dictionary can come from various 
sources, so we can't rely on the presence of the built-in Lucene automaton, can 
we? Even if I wanted to reuse it, there'd be no easy way to determine whether 
it's a full automaton or a partial one (because of the gaps/trimming)... I 
think I'll just implement the solution by building the automaton from whatever 
Dictionary comes in and serializing/deserializing it, similarly to TSTLookup.
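
The build/store/load cycle I have in mind could look roughly like the sketch 
below. Everything in it is an assumption made for illustration: 
SortedPrefixLookup is a hypothetical class, a sorted map stands in for the 
automaton (this is not the Lucene FST API), and the dictionary is modelled as 
plain key/weight pairs rather than the real Dictionary interface.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a sorted map stands in for the automaton.
final class SortedPrefixLookup {
  private final TreeMap<String, Float> entries = new TreeMap<>();

  /** Builds the structure from whatever key/weight pairs the dictionary yields. */
  static SortedPrefixLookup build(Iterable<Map.Entry<String, Float>> dictionary) {
    SortedPrefixLookup lookup = new SortedPrefixLookup();
    for (Map.Entry<String, Float> e : dictionary) {
      lookup.entries.put(e.getKey(), e.getValue());
    }
    return lookup;
  }

  /** Returns all keys starting with prefix, in sorted order. */
  List<String> keysWithPrefix(String prefix) {
    List<String> out = new ArrayList<>();
    // Keys sharing a prefix are contiguous in sorted order, starting at the prefix.
    for (String key : entries.tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break;
      }
      out.add(key);
    }
    return out;
  }

  /** Writes the entry count followed by each key/weight pair. */
  void store(DataOutputStream out) throws IOException {
    out.writeInt(entries.size());
    for (Map.Entry<String, Float> e : entries.entrySet()) {
      out.writeUTF(e.getKey()); // writeUTF caps each encoded key at 65535 bytes
      out.writeFloat(e.getValue());
    }
  }

  /** Reads back exactly what store() wrote. */
  static SortedPrefixLookup load(DataInputStream in) throws IOException {
    SortedPrefixLookup lookup = new SortedPrefixLookup();
    int size = in.readInt();
    for (int i = 0; i < size; i++) {
      String key = in.readUTF();
      lookup.entries.put(key, in.readFloat());
    }
    return lookup;
  }
}
```

The point of building from the incoming pairs is that the structure never 
depends on where the dictionary came from, so the question of a full versus a 
partial automaton does not arise.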

Sounds ok?





> FST-based Lookup (suggestions) for prefix matches.
> --------------------------------------------------
>
>                 Key: SOLR-2378
>                 URL: https://issues.apache.org/jira/browse/SOLR-2378
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>              Labels: lookup, prefix
>             Fix For: 4.0
>
>
> Implement a subclass of Lookup based on finite state automata/transducers 
> (Lucene FST package). This issue is for implementing a relatively basic 
> prefix matcher; we will handle infixes and other types of input matches 
> gradually.
