On 05/21/2013 08:00 PM, Steven Bethard wrote:
> So perhaps we could re-train it to disambiguate newline characters as well?
Yes, the OpenNLP Sentence Detector supports that out of the box in the new 1.5.3 version: you can specify the set of EOS (end-of-sentence) characters to use. The default set is still '.', '!' and '?'. If you have special needs, you can also customize the feature generation. With that in place, it should probably be possible to drop the cTAKES EOS fix.
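To illustrate what widening the EOS set means, here is a minimal, self-contained sketch in plain Java. It does not use the OpenNLP API and is not how the Sentence Detector is implemented; the class name EosCandidates, the method find, and the sample note are all hypothetical. It only shows that every character in the configured set becomes a candidate boundary, which a trained model would then accept or reject.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: not the OpenNLP implementation or API.
public class EosCandidates {

    // Returns the index of every character in text that is in eosChars.
    // Each index is only a candidate sentence boundary; a trained model
    // would still have to decide whether it is a real one.
    public static List<Integer> find(String text, String eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (eosChars.indexOf(text.charAt(i)) >= 0) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Made-up clinical-style note with a line break and no period before it.
        String note = "Pt. stable\nNo acute distress.";

        // Default-style set: the newline is never considered.
        System.out.println(find(note, "!?."));   // prints [2, 28]

        // Newline added to the set: index 10 becomes a candidate too.
        System.out.println(find(note, "!?.\n")); // prints [2, 10, 28]
    }
}
```

Note that index 2 (the period in "Pt.") is a candidate in both cases, and it is the model's job to reject it. Adding the newline to the set only gives the model the chance to make the same kind of decision at line breaks, so it needs training data in which newlines are annotated accordingly.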
Let me know if you have any questions or need some help customizing it for cTAKES.
Jörn