Re: CustomBreakIterator Performance Issues

2021-03-08 Thread David Smiley
The BreakIterator implementations in the JDK (and likely IBM ICU) seem slow and can sometimes dominate the performance of this highlighter. I worked on a large search project (which led to the creation of the UnifiedHighlighter), and we used a technique of encoding the breaks directly into the text a priori.
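To illustrate what "encoding the breaks directly into the text" could look like, here is a minimal sketch of a BreakIterator that only looks for a single marker character inserted at index time, so no sentence analysis happens at highlight time. The class name SeparatorBreakIterator and the idea of using one marker character are assumptions made for illustration; the message does not say how the breaks were actually encoded.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

/**
 * Hypothetical sketch: boundaries are the text start, the text end, and the
 * position right after each occurrence of a pre-inserted separator character.
 */
final class SeparatorBreakIterator extends BreakIterator {
    private final char separator;
    private CharacterIterator text = new StringCharacterIterator("");
    private int current;

    SeparatorBreakIterator(char separator) {
        this.separator = separator;
    }

    @Override public int current() { return current; }

    @Override public int first() {
        current = text.getBeginIndex();
        return current;
    }

    @Override public int last() {
        current = text.getEndIndex();
        return current;
    }

    @Override public int next() {
        return current == text.getEndIndex() ? DONE : following(current);
    }

    @Override public int following(int pos) {
        int end = text.getEndIndex();
        if (pos < text.getBeginIndex() || pos > end) {
            throw new IllegalArgumentException("offset out of bounds: " + pos);
        }
        if (pos == end) {
            return DONE; // nothing after the last boundary; position unchanged
        }
        // Plain linear scan for the next separator; no dictionary or rule lookups.
        for (int i = pos; i < end; i++) {
            if (text.setIndex(i) == separator) {
                current = i + 1;
                return current;
            }
        }
        current = end;
        return current;
    }

    @Override public int previous() {
        int begin = text.getBeginIndex();
        if (current == begin) {
            return DONE;
        }
        // Walk backwards until the character before the candidate is a separator.
        for (int i = current - 1; i > begin; i--) {
            if (text.setIndex(i - 1) == separator) {
                current = i;
                return current;
            }
        }
        current = begin;
        return current;
    }

    @Override public int next(int n) {
        int result = current;
        for (int i = 0; i < Math.abs(n) && result != DONE; i++) {
            result = n > 0 ? next() : previous();
        }
        return result;
    }

    @Override public CharacterIterator getText() { return text; }

    @Override public void setText(CharacterIterator newText) {
        text = newText;
        current = text.getBeginIndex();
    }
}
```

With a marker such as U+001E (an arbitrary choice here) written between sentences when the field is indexed, calling setText(...) and then next() on this iterator returns the offset just after each marker. How such an iterator is plugged into the highlighter depends on the Lucene version in use and is not shown.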

Re: CustomBreakIterator Performance Issues

2021-03-08 Thread df2832368_...@amberoad.de df2832368_...@amberoad.de
And of course the link broke: https://drive.google.com/file/d/1wfZFQD6loTeA9_-eGrdwi9YGtJcNjKli/view?usp=sharing > df2832368_...@amberoad.de df2832368_...@amberoad.de > wrote on 08.03.2021 11:05: > > > Hello, > > I am currently working on getting a custom BreakIterator

CustomBreakIterator Performance Issues

2021-03-08 Thread df2832368_...@amberoad.de df2832368_...@amberoad.de
Hello, I am currently working on getting a custom BreakIterator for the Unified Highlighter to work, and I am struggling a bit performance-wise. I need a BreakIterator to get nice highlights of passages. For this I want the start of the highlight to be a sentence start and the end to be a word-en