[
https://issues.apache.org/jira/browse/LUCENE-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921162#comment-13921162
]
Robert Muir commented on LUCENE-5490:
-------------------------------------
Also, MAX_TERM_LENGTH is in UTF-8 bytes, but this count is in UTF-16 code units.
So I think MAX_TERM_LENGTH is not a great default.
Would MAX_TERM_LENGTH/3 be better? That way, if you use LengthFilter out of the box
because you tried to index a video file or something (which, with Java's default
decoding, is likely to contain many 3-byte U+FFFD replacement characters), you
won't ever hit the IndexWriter limit.
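
To make the arithmetic concrete, here is a minimal sketch that does not depend on Lucene. The class name, the helper methods, and the hard-coded constant are illustrative assumptions; in particular, 32766 is assumed to be the value of IndexWriter.MAX_TERM_LENGTH and should be checked against the Lucene version in use. It shows that a term made of U+FFFD characters takes three UTF-8 bytes per UTF-16 code unit, so capping the char count at a third of the byte limit keeps such a term within it.

```java
import java.nio.charset.StandardCharsets;

final class TermLengthSketch {
    // Assumption: stands in for IndexWriter.MAX_TERM_LENGTH, which is in UTF-8 bytes.
    static final int ASSUMED_MAX_TERM_LENGTH = 32766;

    // Proposed default for LengthFilter's max, which is in UTF-16 code units.
    // One UTF-16 code unit encodes to at most 3 UTF-8 bytes.
    static final int SAFE_MAX_CHARS = ASSUMED_MAX_TERM_LENGTH / 3;

    // Number of bytes the term occupies once encoded as UTF-8.
    static int utf8Length(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length;
    }

    // Builds a worst-case term: n copies of the U+FFFD replacement character.
    static String repeatReplacementChar(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('\uFFFD');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String term = repeatReplacementChar(SAFE_MAX_CHARS);
        System.out.println(term.length() + " chars, " + utf8Length(term) + " UTF-8 bytes");
    }
}
```

A term of the same char length as the byte limit would instead encode to roughly three times that limit, which is the case the division by 3 is meant to rule out.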
> make LengthFilterFactory's min/max args have sensible defaults
> --------------------------------------------------------------
>
> Key: LUCENE-5490
> URL: https://issues.apache.org/jira/browse/LUCENE-5490
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Hoss Man
> Priority: Minor
>
> LengthFilterFactory's min/max args are currently required, but it seems like
> we could give them sensible defaults and make them optional...
> min = 0
> max = IndexWriter.MAX_TERM_LENGTH