[jira] [Commented] (SOLR-10256) Parentheses in SpellCheckCollator

James Dyer (JIRA) Mon, 13 Mar 2017 08:02:31 -0700

    [ https://issues.apache.org/jira/browse/SOLR-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907616#comment-15907616 ]


James Dyer commented on SOLR-10256:
-----------------------------------

[~asingh2411]  Assuming you can't use the collation provided because of the 
added parentheses, could you just specify 
"spellcheck.collateExtendedResults=true" and have your application use the 
information it returns to craft a new query the way you want it?  This 
might be a good workaround until (or unless) we decide to change the current 
behavior.
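
A rough sketch of that workaround on the client side, in Python. It assumes 
the extended collation entry has already been parsed out of the response and 
that its "misspellingsAndCorrections" value arrives either as a flat 
[original, correction, ...] list or as a map, depending on the response 
writer settings. The helper names and the sample entry are hypothetical, not 
part of Solr.

```python
import re


def correction_pairs(misspellings_and_corrections):
    """Normalize the corrections into (original, correction) pairs.

    Assumption: the value is either a dict or a flat list that alternates
    original term, corrected term.
    """
    if isinstance(misspellings_and_corrections, dict):
        return list(misspellings_and_corrections.items())
    items = list(misspellings_and_corrections)
    return list(zip(items[0::2], items[1::2]))


def rebuild_query(original_query, pairs):
    """Substitute each correction into the original query, adding no parentheses."""
    mapping = dict(pairs)
    if not mapping:
        return original_query
    # Longest terms first so a short term never shadows a longer one.
    alternation = "|".join(
        re.escape(term) for term in sorted(mapping, key=len, reverse=True)
    )
    # Match whole terms only; operators such as '+' or '-' stay in place.
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
    return pattern.sub(lambda m: mapping[m.group(0)], original_query)


# Hypothetical extended collation entry for the query "applejuice".
entry = {
    "collationQuery": "(apple juice)",
    "misspellingsAndCorrections": ["applejuice", "apple juice"],
}
pairs = correction_pairs(entry["misspellingsAndCorrections"])
new_query = rebuild_query("applejuice", pairs)
```

The application ignores "collationQuery" here and builds its own string, so 
it decides whether the broken-up words get grouped at all.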

Really, instead of adding a flag, I'd like to either keep it as-is or fix it 
to work correctly for the most common use cases, rather than have an obscure 
flag that users need to worry about.  You may want to review the 
"testCollate()" method in [WordBreakSolrSpellChecker's unit 
test|https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/spelling/WordBreakSolrSpellCheckerTest.java],
 and suggest better outcomes for these queries with a little discussion as to 
_why_ your suggestions would be better.

The idea here is, if "pine" is a required term and the spellcheck breaks it to 
"pi ne", then both "pi" and "ne" should be required also.  But maybe this is 
not the best thing to try and enforce?
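
To make that rule concrete, here is a small Python illustration of carrying a 
term's required/prohibited operator over to each piece of a word-break 
suggestion. It only models the idea described above; it is not the collator's 
actual code, and the function name is made up for this example.

```python
def propagate_operator(original_token, pieces):
    """Give every piece the same '+' or '-' prefix the original token had."""
    op = original_token[0] if original_token[:1] in ("+", "-") else ""
    return " ".join(op + piece for piece in pieces)


required = propagate_operator("+pine", ["pi", "ne"])
optional = propagate_operator("pine", ["pi", "ne"])
```

With "+pine" both pieces come back required ("+pi +ne"); with a plain "pine" 
they stay optional ("pi ne").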

bq. And when surrounded by brackets, they represent the same position by 
EdismaxParser 

I'm trying to find some documentation somewhere that says this, or maybe a test 
case that demonstrates it?  I apologize for my ignorance on the ins and outs of 
edismax here.

> Parentheses in SpellCheckCollator
> ---------------------------------
>
>                 Key: SOLR-10256
>                 URL: https://issues.apache.org/jira/browse/SOLR-10256
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: spellchecker
>            Reporter: Abhishek Kumar Singh
>         Attachments: SOLR-10256.patch
>
>
> SpellCheckCollator adds parentheses ( *'('* and *')'* ) around tokens which 
> have space between them.  
> This should be configurable, because if *_WordBreakSpellCheckComponent_* is 
> being used, queries like : *applejuice* will be broken down to *apple juice*. 
> Such suggestions are being surrounded by braces by current 
> *SpellCheckCollator*. 
> And when surrounded by brackets, they represent the same position by 
> _EdismaxParser_ , which is not required. 
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/spelling/SpellCheckCollator.java#L227
>   
> A solution to this will be to have a flag, which can help disable this 
> parenthesisation of spell check suggestions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

