[ 
https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052246#comment-13052246
 ] 

Robert Muir commented on LUCENE-2341:
-------------------------------------

Hi Michał,

This patch looks great!

I took a quick glance; here are a couple of suggestions:
* In the MorfologikFilter, I think we should implement reset(), first calling 
the superclass reset(), then clearing the stemsAcc list. This ensures that all 
of the filter's state is cleared before it is reused. Under normal operation 
this should not be necessary, but some consumers in Lucene (e.g. 
LimitTokenCountFilter, and some similar code in the Highlighter) consume only 
part of a stream and then stop. By clearing this list in reset() we ensure 
that no leftover stems can appear in the next stream.
* Because the data is licensed under MPL, I think we should explicitly list a 
hyperlink to the source code, if possible, in the NOTICE.txt. I saw you 
included some wording in LICENSE.txt, but I think this should only say 'XYZ 
data is under this license', followed by the actual MPL license text. In the 
NOTICE.txt we should link to the source code, I think... there is some more 
information on this under the section Category B: Reciprocal Licenses at 
http://www.apache.org/legal/3party.html
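
To illustrate the reset() suggestion, here is a minimal, self-contained sketch 
of the pattern. It deliberately does not use Lucene's classes: SketchStream, 
SketchFilter, WordSource and StemFilterSketch are hypothetical stand-ins, and 
the "/stem1" and "/stem2" outputs are made up in place of a real dictionary 
lookup. Only the stemsAcc name and the "call super.reset(), then clear the 
list" order come from the comment above; the actual patch may look different.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a token stream; NOT Lucene's TokenStream API. */
abstract class SketchStream {
    /** Returns the next token, or null when the stream is exhausted. */
    abstract String next();

    /** Clears per-stream state so the instance can be reused. */
    void reset() {}
}

/** Hypothetical stand-in for a filter base class that forwards reset(). */
abstract class SketchFilter extends SketchStream {
    protected final SketchStream input;

    SketchFilter(SketchStream input) { this.input = input; }

    @Override
    void reset() { input.reset(); }
}

/** Emits whatever words it was last given through setInput(). */
final class WordSource extends SketchStream {
    private Iterator<String> words = Collections.emptyIterator();

    void setInput(List<String> input) { words = input.iterator(); }

    @Override
    String next() { return words.hasNext() ? words.next() : null; }
}

/** Emits two made-up "stems" per input word, buffering them in stemsAcc. */
final class StemFilterSketch extends SketchFilter {
    private final Deque<String> stemsAcc = new ArrayDeque<>();

    StemFilterSketch(SketchStream input) { super(input); }

    @Override
    String next() {
        if (stemsAcc.isEmpty()) {
            String word = input.next();
            if (word == null) {
                return null;
            }
            // Fake lookup: a real stemmer would consult its dictionary here.
            stemsAcc.add(word + "/stem1");
            stemsAcc.add(word + "/stem2");
        }
        return stemsAcc.poll();
    }

    @Override
    void reset() {
        super.reset();     // first let the rest of the chain reset itself
        stemsAcc.clear();  // then drop stems buffered for the previous stream
    }
}

class ResetDemo {
    public static void main(String[] args) {
        WordSource source = new WordSource();
        StemFilterSketch filter = new StemFilterSketch(source);

        // First stream: the consumer stops after a single token,
        // so "kot/stem2" is still sitting in stemsAcc.
        source.setInput(Arrays.asList("kot"));
        filter.reset();
        filter.next();

        // Second stream on the same filter instance.
        source.setInput(Arrays.asList("pies"));
        filter.reset();
        System.out.println(filter.next()); // prints "pies/stem1"
    }
}
```

If the stemsAcc.clear() call is removed from this sketch, the second stream 
starts with the leftover "kot/stem2" from the first one, which is the 
situation the reset() override is meant to rule out.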


> explore morfologik integration
> ------------------------------
>
>                 Key: LUCENE-2341
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2341
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Dawid Weiss
>         Attachments: LUCENE-2341.diff, morfologik-stemming-1.5.0.jar
>
>
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer 
> available:
> http://sourceforge.net/projects/morfologik/
> This works differently than LUCENE-2298, and ideally would be another option 
> for users.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


