[
https://issues.apache.org/jira/browse/SOLR-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907616#comment-15907616
]
James Dyer commented on SOLR-10256:
-----------------------------------
[~asingh2411] Assuming you can't use the collation provided because of the
added parentheses, could you just specify
"spellcheck.collateExtendedResults=true" and have your application use the
information it returns to craft a new query the way you want it? This
might be a good workaround unless and until we decide to change the current
behavior. Really, instead of adding a flag, I'd like to either keep the
behavior as-is or fix it to work correctly for the most common use cases,
rather than have an obscure flag that users need to worry about. You may want
to review the "testCollate()" method in [WordBreakSolrSpellChecker's unit
test|https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/spelling/WordBreakSolrSpellCheckerTest.java],
and suggest better outcomes for these queries, with a little discussion as to
_why_ your suggestions would be better.
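As a rough illustration of that workaround, here is a minimal client-side sketch. It assumes the application has already read the original-to-corrected term pairs out of the extended collation results (reported, I believe, under "misspellingsAndCorrections") into a map, and that splitting the user's query on whitespace is good enough. The class and method names are hypothetical and are not part of Solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper, not part of Solr.
final class CollationRewriter {

    /**
     * Replaces each misspelled token in the original query with its
     * correction, without wrapping multi-word corrections in parentheses.
     * Assumes simple whitespace-separated tokens with no operators or
     * field prefixes attached.
     */
    static String rewrite(String originalQuery, Map<String, String> corrections) {
        StringBuilder out = new StringBuilder();
        for (String token : originalQuery.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            // Tokens without a correction are passed through unchanged.
            out.append(corrections.getOrDefault(token, token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> corrections = new LinkedHashMap<>();
        corrections.put("applejuice", "apple juice");
        // prints: fresh apple juice
        System.out.println(rewrite("fresh applejuice", corrections));
    }
}
```

A real application would need to handle operators, field prefixes, and quoting, which this sketch deliberately ignores.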
The idea here is, if "pine" is a required term and the spellcheck breaks it
into "pi ne", then both "pi" and "ne" should be required also. But maybe this
is not the best thing to try to enforce?
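To make that rule concrete, here is a simplified sketch of the idea only. It is not the actual SpellCheckCollator code, the class and method names are hypothetical, and the exact text the collator emits may differ from what this produces.

```java
// Hypothetical illustration of the rule described above; not Solr code.
final class RequiredTermSketch {

    /**
     * Applies the original term's operator ('+' for required, '-' for
     * prohibited) to every word of a correction. Multi-word corrections
     * are grouped in parentheses so they stay together as one clause.
     */
    static String applyOperator(char operator, String correction) {
        String[] words = correction.trim().split("\\s+");
        if (words.length == 1) {
            // Single-word correction: just carry the operator over.
            return operator + words[0];
        }
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(operator).append(words[i]);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // prints: (+pi +ne)
        System.out.println(applyOperator('+', "pi ne"));
    }
}
```

Dropping the parentheses here would leave "+pi +ne" as two independent required clauses, which is the alternative under discussion.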
bq. And when surrounded by brackets, they represent the same position by
EdismaxParser
I'm trying to find some documentation somewhere that says this, or maybe a test
case that demonstrates it? I apologize for my ignorance on the ins and outs of
edismax here.
> Parentheses in SpellCheckCollator
> ---------------------------------
>
> Key: SOLR-10256
> URL: https://issues.apache.org/jira/browse/SOLR-10256
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: spellchecker
> Reporter: Abhishek Kumar Singh
> Attachments: SOLR-10256.patch
>
>
> SpellCheckCollator adds parentheses ( *'('* and *')'* ) around tokens which
> have a space between them.
> This should be configurable, because if *_WordBreakSpellCheckComponent_* is
> being used, queries like *applejuice* will be broken down to *apple juice*.
> Such suggestions are surrounded by parentheses by the current
> *SpellCheckCollator*.
> And when surrounded by brackets, they represent the same position by
> _EdismaxParser_ , which is not required.
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/spelling/SpellCheckCollator.java#L227
>
> A solution to this will be to have a flag, which can help disable this
> parenthesisation of spell check suggestions.