[ https://issues.apache.org/jira/browse/LUCENE-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-5111:
--------------------------------
Attachment: LUCENE-5111.patch
Here is a patch. It's not super-optimized, but the three common cases (no
delimiters, all delimiters, just one word surrounded by delimiters) are just as
fast. For the concatenation+parts handling I used captureState (we can avoid
it; my focus here was correctness).
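
To illustrate the three common cases named above, here is a minimal standalone
sketch in plain Java. It is not taken from the patch and does not use any
Lucene API: the class FastPathSketch, the enum Kind, and the methods
isDelimiter and classify are hypothetical names, and treating every
non-alphanumeric character as a delimiter is a simplifying assumption (the
real filter also splits on case changes and letter/digit transitions).

```java
// Hypothetical illustration only; not code from LUCENE-5111.patch.
final class FastPathSketch {

    // The three common cases from the message, plus everything else.
    enum Kind { NO_DELIMITERS, ALL_DELIMITERS, SINGLE_WORD, GENERAL }

    // Assumption: any character that is not a letter or digit is a delimiter.
    static boolean isDelimiter(char c) {
        return !Character.isLetterOrDigit(c);
    }

    static Kind classify(String term) {
        int start = 0;
        int end = term.length();

        // Skip leading and trailing delimiters.
        while (start < end && isDelimiter(term.charAt(start))) {
            start++;
        }
        while (end > start && isDelimiter(term.charAt(end - 1))) {
            end--;
        }

        // Nothing left: the term was only delimiters (or empty).
        if (start == end) {
            return Kind.ALL_DELIMITERS;
        }

        // A delimiter inside the trimmed region means several parts,
        // which is the general (slower) case.
        for (int i = start; i < end; i++) {
            if (isDelimiter(term.charAt(i))) {
                return Kind.GENERAL;
            }
        }

        // One word: either untouched, or surrounded by delimiters.
        boolean untouched = (start == 0 && end == term.length());
        return untouched ? Kind.NO_DELIMITERS : Kind.SINGLE_WORD;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"foo", "---", "-foo-", "foo-bar"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

In a sketch like this, only the GENERAL case would need the
concatenation+parts handling; the other three can be decided in a single pass
over the term.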
I think this is fairly important to fix, so that users can use e.g. the
postings highlighter without hitting bugs like
http://stackoverflow.com/questions/20324016/shingle-filter-factory-startoffset-must-be-non-negative-and-endoffset-must-be
> Fix WordDelimiterFilter
> -----------------------
>
> Key: LUCENE-5111
> URL: https://issues.apache.org/jira/browse/LUCENE-5111
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Attachments: LUCENE-5111.patch
>
>
> WordDelimiterFilter is documented as broken in TestRandomChains
> (LUCENE-4641). Given how widely used it is, we should try to fix it.