[
https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052246#comment-13052246
]
Robert Muir commented on LUCENE-2341:
-------------------------------------
Hi Michał,
This patch looks great!
I took a quick glance; here are a couple of suggestions:
* In the MorfologikFilter, I think we should implement reset(), first calling
the superclass reset(), then clearing the stemsAcc list. This ensures that all
of the filter's state is cleared before it is reused. Under normal operations,
this should not be necessary, but some consumers in Lucene (e.g.
LimitTokenCountFilter, and some similar code in the Highlighter) will only
partially consume the stream up to some point, then suddenly stop. By clearing this list
in reset() we ensure that there is no chance any leftover stems will appear in
the next stream.
* Because the data is licensed under the MPL, I think we should explicitly list a
hyperlink to the source code used, if possible, in NOTICE.txt. I saw you
included some wording in LICENSE.txt, but I think this should only say 'XYZ data
is under this license', along with the actual MPL license text. In NOTICE.txt we
should link to the source code, I think... there is some more information on
this under the section "Category B: Reciprocal Licenses" at
http://www.apache.org/legal/3party.html
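
To illustrate the first suggestion, here is a minimal, self-contained sketch of the reset() pattern. It deliberately does not use Lucene's classes: SketchStream, SketchFilter, WordSource and StemmingFilterSketch are hypothetical stand-ins, the "stemmer" is faked, and the clearOnReset flag exists only to show the difference. Only the name stemsAcc comes from the patch; the real MorfologikFilter would extend Lucene's TokenFilter and call super.reset().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream API.
abstract class SketchStream {
    /** Returns the next token, or null when the stream is exhausted. */
    abstract String next();

    /** Clears per-stream state so the instance can be reused. */
    void reset() {}
}

// Hypothetical stand-in for a filter base class: reset() resets the wrapped input.
abstract class SketchFilter extends SketchStream {
    protected final SketchStream input;

    SketchFilter(SketchStream input) { this.input = input; }

    @Override
    void reset() { input.reset(); }
}

// Hypothetical source that emits a fixed list of words and can be re-pointed at new text.
class WordSource extends SketchStream {
    private List<String> words = new ArrayList<>();
    private int pos = 0;

    void setWords(String... w) { words = Arrays.asList(w); }

    @Override
    String next() { return pos < words.size() ? words.get(pos++) : null; }

    @Override
    void reset() { pos = 0; }
}

// Filter that expands each word into two fake "stems", buffered in stemsAcc.
class StemmingFilterSketch extends SketchFilter {
    private final List<String> stemsAcc = new ArrayList<>();
    private final boolean clearOnReset; // demo switch only, not part of any real API

    StemmingFilterSketch(SketchStream input, boolean clearOnReset) {
        super(input);
        this.clearOnReset = clearOnReset;
    }

    @Override
    String next() {
        if (stemsAcc.isEmpty()) {
            String word = input.next();
            if (word == null) return null;
            // Fake stemmer: every word yields two candidate stems.
            stemsAcc.add(word + "#1");
            stemsAcc.add(word + "#2");
        }
        return stemsAcc.remove(0);
    }

    @Override
    void reset() {
        super.reset();                      // first reset the superclass state
        if (clearOnReset) stemsAcc.clear(); // then drop any leftover stems
    }
}

class ResetSketchDemo {
    /** Consumes one stem of "koty", then reuses the filter on "psy". */
    static String firstTokenAfterReuse(boolean clearOnReset) {
        WordSource src = new WordSource();
        StemmingFilterSketch filter = new StemmingFilterSketch(src, clearOnReset);
        src.setWords("koty");
        filter.reset();
        filter.next();          // partial consumption: "koty#2" stays buffered
        src.setWords("psy");
        filter.reset();
        return filter.next();
    }

    public static void main(String[] args) {
        System.out.println(firstTokenAfterReuse(false)); // koty#2 (leftover stem leaks)
        System.out.println(firstTokenAfterReuse(true));  // psy#1
    }
}
```

Without the clear, the stem left over from the partially consumed stream shows up as the first token of the next one, which is exactly the leak described above.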
> explore morfologik integration
> ------------------------------
>
> Key: LUCENE-2341
> URL: https://issues.apache.org/jira/browse/LUCENE-2341
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Robert Muir
> Assignee: Dawid Weiss
> Attachments: LUCENE-2341.diff, morfologik-stemming-1.5.0.jar
>
>
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer
> available:
> http://sourceforge.net/projects/morfologik/
> This works differently than LUCENE-2298, and ideally would be another option
> for users.