[ 
https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924391#action_12924391
 ] 

Yonik Seeley edited comment on SOLR-2010 at 10/24/10 6:53 PM:
--------------------------------------------------------------

bq. James, I did the merge back to 3.x.

FYI, you missed Robert's resource leak fixes to SpellCheckCollatorTest.
Not sure what best practice is to catch stuff like this... if it's only a file 
or two, I guess check the history of each?

edit: actually, your backport to 3x didn't even touch SpellCheckCollatorTest.  I 
was misled because the history of SpellCheckCollatorTest shows an update in 
that commit.  But I guess the commit only changed merge properties on the 
file, not its contents (the diff below is empty).  Ugh.

{noformat}
yo...@wolverine /cygdrive/c/code/lusolr_3x
$ svn log ./solr/src/test/org/apache/solr/spelling/SpellCheckCollatorTest.java
------------------------------------------------------------------------
r1026000 | gsingers | 2010-10-21 09:48:34 -0400 (Thu, 21 Oct 2010) | 1 line

SOLR-2010, including Yonik's fix, SOLR-2181 -- hope I did this merge correctly
------------------------------------------------------------------------
r1021439 | gsingers | 2010-10-11 13:32:11 -0400 (Mon, 11 Oct 2010) | 1 line

SOLR-2010: added richer support for spell checking collations
------------------------------------------------------------------------

yo...@wolverine /cygdrive/c/code/lusolr_3x
$ svn diff -r 1021439:1026000 ./solr/src/test/org/apache/solr/spelling/SpellCheckCollatorTest.java

yo...@wolverine /cygdrive/c/code/lusolr_3x
{noformat}

I'm in the process of getting branch_3x to pass the searcher open/close test, 
so I'll handle this.

> Improvements to SpellCheckComponent Collate functionality
> ---------------------------------------------------------
>
>                 Key: SOLR-2010
>                 URL: https://issues.apache.org/jira/browse/SOLR-2010
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, spellchecker
>    Affects Versions: 1.4.1
>         Environment: Tested against trunk revision 966633
>            Reporter: James Dyer
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: multiple_collations_as_an_array.patch, SOLR-2010.patch, 
> SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.txt, 
> SOLR-2010_141.patch, SOLR-2010_141.patch, 
> SOLR-2010_shardRecombineCollations_993538.patch, 
> SOLR-2010_shardRecombineCollations_999521.patch, 
> SOLR-2010_shardSearchHandler_993538.patch, 
> SOLR-2010_shardSearchHandler_999521.patch, solr_2010_3x.patch
>
>
> Improvements to SpellCheckComponent Collate functionality
> Our project requires a better Spell Check Collator.  I'm contributing this as 
> a patch to get suggestions for improvements and in case there is a broader 
> need for these features.
> 1. Only return collations that are guaranteed to result in hits if re-queried 
> (applying original fq params also).  This is especially helpful when there is 
> more than one correction per query.  The 1.4 behavior does not verify that a 
> particular combination will actually return hits.
> 2. Provide the option to get multiple collation suggestions
> 3. Provide extended collation results including the # of hits re-querying 
> will return and a breakdown of each misspelled word and its correction.
> This patch is similar to what is described in SOLR-507 item #1.  Also, this 
> patch provides a viable workaround for the problem discussed in SOLR-1074.  A 
> dictionary could be created that combines the terms from the multiple fields. 
>  The collator then would prune out any spurious suggestions this would cause.
> This patch adds the following spellcheck parameters:
> 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try 
> before giving up.  Lower values ensure better performance.  Higher values may 
> be necessary to find a collation that can return results.  Default is 0, 
> which maintains backwards-compatible behavior (do not check collations).
> 2. spellcheck.maxCollations - maximum # of collations to return.  Default is 
> 1, which maintains backwards-compatible behavior.
> 3. spellcheck.collateExtendedResult - if true, returns an expanded response 
> format detailing collations found.  Default is false, which maintains 
> backwards-compatible behavior.  When true, output is like this (in context):
> <lst name="spellcheck">
>       <lst name="suggestions">
>               <lst name="hopq">
>                       <int name="numFound">94</int>
>                       <int name="startOffset">7</int>
>                       <int name="endOffset">11</int>
>                       <arr name="suggestion">
>                               <str>hope</str>
>                               <str>how</str>
>                               <str>hope</str>
>                               <str>chops</str>
>                               <str>hoped</str>
>                               etc
>                       </arr>
>               </lst>
>               <lst name="faill">
>                       <int name="numFound">100</int>
>                       <int name="startOffset">16</int>
>                       <int name="endOffset">21</int>
>                       <arr name="suggestion">
>                               <str>fall</str>
>                               <str>fails</str>
>                               <str>fail</str>
>                               <str>fill</str>
>                               <str>faith</str>
>                               <str>all</str>
>                               etc
>                       </arr>
>               </lst>
>               <lst name="collation">
>                       <str name="collationQuery">Title:(how AND fails)</str>
>                       <int name="hits">2</int>
>                       <lst name="misspellingsAndCorrections">
>                               <str name="hopq">how</str>
>                               <str name="faill">fails</str>
>                       </lst>
>               </lst>
>               <lst name="collation">
>                       <str name="collationQuery">Title:(hope AND faith)</str>
>                       <int name="hits">2</int>
>                       <lst name="misspellingsAndCorrections">
>                               <str name="hopq">hope</str>
>                               <str name="faill">faith</str>
>                       </lst>
>               </lst>
>               <lst name="collation">
>                       <str name="collationQuery">Title:(chops AND all)</str>
>                       <int name="hits">1</int>
>                       <lst name="misspellingsAndCorrections">
>                               <str name="hopq">chops</str>
>                               <str name="faill">all</str>
>                       </lst>
>               </lst>
>       </lst>
> </lst>
> In addition, SolrJ is updated to include 
> SpellCheckResponse.getCollatedResults(), which will return the expanded 
> Collation format.  getCollatedResult(), which returns a single String, is 
> retained for backwards compatibility.  Other APIs were not changed but will 
> still work provided that spellcheck.collateExtendedResult is false.
> This will likely not return valid results when using shards.  That would 
> require a more robust interaction with the index than what exists in 
> SpellCheckCollator.collate().
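
As a rough illustration of the parameters and the extended response format 
described above, here is a minimal sketch using only the Java standard library 
(Java 10+ assumed for URLEncoder.encode(String, Charset)).  The class and 
method names (CollationExample, buildQueryString, parseCollations) are 
hypothetical and are not part of Solr or SolrJ.  The three new parameter names 
come from the description; spellcheck and spellcheck.collate are the existing 
Solr parameters that enable spell checking and collation.  The parser assumes 
the response is shaped like the sample XML in the description.

```java
import java.io.StringReader;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only; not part of Solr or SolrJ.
final class CollationExample {

    /** One extended collation entry: the rewritten query and its hit count. */
    static final class Collation {
        final String query;
        final int hits;

        Collation(String query, int hits) {
            this.query = query;
            this.hits = hits;
        }
    }

    /** Builds a URL-encoded query string that sets the three new parameters. */
    static String buildQueryString(String q, int maxTries, int maxCollations,
                                   boolean extended) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("spellcheck", "true");
        params.put("spellcheck.collate", "true");
        params.put("spellcheck.maxCollationTries", Integer.toString(maxTries));
        params.put("spellcheck.maxCollations", Integer.toString(maxCollations));
        params.put("spellcheck.collateExtendedResult", Boolean.toString(extended));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Extracts every <lst name="collation"> entry from an XML response. */
    static List<Collation> parseCollations(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response XML", e);
        }
        List<Collation> result = new ArrayList<>();
        NodeList lists = doc.getElementsByTagName("lst");
        for (int i = 0; i < lists.getLength(); i++) {
            Element lst = (Element) lists.item(i);
            if (!"collation".equals(lst.getAttribute("name"))) {
                continue;
            }
            String query = null;
            int hits = -1; // stays -1 if the entry has no "hits" element
            NodeList children = lst.getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                if (!(children.item(j) instanceof Element)) {
                    continue; // skip whitespace text nodes
                }
                Element child = (Element) children.item(j);
                String name = child.getAttribute("name");
                if ("collationQuery".equals(name)) {
                    query = child.getTextContent();
                } else if ("hits".equals(name)) {
                    hits = Integer.parseInt(child.getTextContent().trim());
                }
            }
            result.add(new Collation(query, hits));
        }
        return result;
    }
}
```

The query string would be appended to whatever select URL the deployment 
exposes.  With spellcheck.maxCollationTries left at 0 the collations are not 
checked against the index, so the hit counts are only meaningful when it is 
set to a positive value.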

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.