[jira] [Commented] (LUCENE-7762) Add EnglishAnalyzer.setMaxTokenLength

Robert Muir (JIRA) Fri, 31 Mar 2017 03:58:27 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950677#comment-15950677
 ]


Robert Muir commented on LUCENE-7762:
-------------------------------------

We did a lot of work to remove analyzer customizations and options. Really they 
should be "examples" and you should use CustomAnalyzer if you want to tweak 
behavior.

Otherwise we run into lots of backwards-compatibility issues. Or cases like 
this one: why should EnglishAnalyzer's API be bound to StandardTokenizer at 
all? It should not show its cards; these things make it hard or impossible to 
improve it later. And why just EnglishAnalyzer? If it's going to show its cards, 
why shouldn't all the other StandardTokenizer-using analyzers show their cards 
too? I think consistency is important.

These analyzers are still defined with Java code (versus configuration), but 
this is also not good. Such options make them hard to improve from that 
perspective too.

And really the only reason a setter is wanted is because they are defined with 
Java code today. If they weren't, be honest, you'd just tweak the configuration.

For all these reasons, I'm not sure we should do this.
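
For illustration only, here is a rough sketch of the CustomAnalyzer route 
mentioned above. It is not taken from the issue or from any patch. The class 
name EnglishLikeAnalyzer and its two methods are hypothetical, the value 512 
is arbitrary, and the filter chain only approximates what EnglishAnalyzer does 
by default (no stem exclusion set). It assumes the lucene-analyzers-common 
module is on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, used here for illustration only.
public class EnglishLikeAnalyzer {

    // Builds an English-style analysis chain by factory name, passing the
    // max token length to the tokenizer as ordinary configuration.
    public static Analyzer build(int maxTokenLength) throws IOException {
        return CustomAnalyzer.builder()
            .withTokenizer("standard", "maxTokenLength", Integer.toString(maxTokenLength))
            .addTokenFilter("englishPossessive") // strips trailing 's
            .addTokenFilter("lowercase")
            .addTokenFilter("stop")              // factory's default stop word set
            .addTokenFilter("porterStem")
            .build();
    }

    // Runs the analyzer over the text and collects the emitted terms.
    public static List<String> analyze(Analyzer analyzer, String text) throws IOException {
        List<String> terms = new ArrayList<>();
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(term.toString());
            }
            ts.end();
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        try (Analyzer analyzer = build(512)) {
            System.out.println(analyze(analyzer, "The Dog's bones"));
        }
    }
}
```

With this approach the token length limit is a parameter of the tokenizer 
factory, so EnglishAnalyzer itself would not need to expose a setter or 
reveal which tokenizer it uses.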

> Add EnglishAnalyzer.setMaxTokenLength
> -------------------------------------
>
>                 Key: LUCENE-7762
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7762
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (7.0), 6.6
>
>
> I think EnglishAnalyzer should also let you change the default (255) max 
> token length of the StandardTokenizer it's invoking.
> I will also fold the javadoc fixes from LUCENE-7760 here.


