[
https://issues.apache.org/jira/browse/LUCENE-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026299#comment-14026299
]
Robert Muir commented on LUCENE-5749:
-------------------------------------
Right, it's zero lines of code, actually. These analyzers are just
"default/example" chains. They aren't code.
It's just not feasible, or even wanted, to add such options to them; it's too
difficult to maintain. The current analyzers already look hellacious because of
the existing constraints, like backwards compatibility, that we have to lug
around for years.
Personally, stuff like back compat on what are just default definitions,
Versions, stopword options, etc. totally discourages me from improving any of
the existing analyzers (I would rather avoid the hassle), even though quite a
few aren't in great shape and could use better defaults or algorithms.
If you want to do something expert, like changing the default stemming
algorithm, please define your own chain. It's really not that hard.
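As a rough, library-independent sketch of that idea (plain Java, not Lucene's
API): when the chain is your own, the stemming step is just one link that you
choose. In Lucene itself the equivalent is to subclass Analyzer and override
createComponents. The class name CustomChainSketch, the analyze and stemmerFor
helpers, and the toy suffix-stripping stemmers below are hypothetical
stand-ins; only the StemStrategy enum comes from the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical, library-independent model of a hand-built analysis chain.
// It does NOT use Lucene classes; it only shows that the stemming step is
// a pluggable link when you assemble the chain yourself.
public class CustomChainSketch {

    // Same enum as in the issue description.
    public enum StemStrategy { AGGRESSIVE, LIGHT, NONE }

    // Toy stemmers (assumption): real stemming algorithms are far more involved.
    public static UnaryOperator<String> stemmerFor(StemStrategy strategy) {
        switch (strategy) {
            case AGGRESSIVE:
                return token -> stripSuffix(token, "ing", "ed", "es", "s");
            case LIGHT:
                return token -> stripSuffix(token, "s");
            default:
                return UnaryOperator.identity();
        }
    }

    // Removes the first matching suffix, leaving at least three characters.
    static String stripSuffix(String token, String... suffixes) {
        for (String suffix : suffixes) {
            if (token.length() > suffix.length() + 2 && token.endsWith(suffix)) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // The chain: tokenize -> lowercase -> drop stop words -> stem.
    public static List<String> analyze(String text, Set<String> stopWords,
                                       StemStrategy strategy) {
        UnaryOperator<String> stem = stemmerFor(strategy);
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split can yield an empty leading element
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (stopWords.contains(token)) {
                continue;
            }
            tokens.add(stem.apply(token));
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        String text = "The indexing of analyzers";
        for (StemStrategy strategy : StemStrategy.values()) {
            System.out.println(strategy + " -> " + analyze(text, stopWords, strategy));
        }
    }
}
```

Calling analyze with a different StemStrategy changes only the final step;
tokenizing, lowercasing, and stop word removal stay the same, and no shared
analyzer class has to grow a new constructor option.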
> analyzers should be further customizable to allow for better code reuse
> -----------------------------------------------------------------------
>
> Key: LUCENE-5749
> URL: https://issues.apache.org/jira/browse/LUCENE-5749
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 4.8.1
> Environment: All
> Reporter: Jamie
> Priority: Minor
> Labels: analyzers
>
> To promote code reuse, the customizability of the analyzers included with
> Lucene (e.g. EnglishAnalyzer) ought to be further improved.
> To illustrate, it is currently difficult to specify general stemming behavior
> without having to modify each and every analyzer class. In our case, we had
> to change the constructors of every analyzer class to accept an
> AnalyzerOption argument.
> The AnalyzerOption class has a getStemStrategy() method. StemStrategy is
> defined as follows:
> public enum StemStrategy { AGGRESSIVE, LIGHT, NONE };
> We needed to modify upwards of 20 Lucene classes. This is obviously not ideal
> from a code reuse and maintainability standpoint.