Tim, thanks for working on this!

Question: do we have a formal way of evaluating the sentence detector? Maybe
we should put together a dev set that includes examples from MIMIC...
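
To make "formal" concrete: one common option is precision/recall/F1 over
sentence-boundary positions against a hand-annotated dev set. Here is a
minimal sketch in Python, assuming gold and predicted boundaries are given as
character offsets of sentence ends. The function name and the sample offsets
are made up for illustration; nothing here is taken from cTAKES itself.

```python
def boundary_prf(gold, predicted):
    """Precision, recall and F1 over sentence-end character offsets.

    gold, predicted: iterables of integer offsets (hypothetical format,
    assumed here only for illustration).
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)  # boundaries both sides agree on
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up offsets: the detector finds every gold boundary but also
# splits once at a newline (offset 60) that is not a real sentence end.
gold_offsets = [12, 40, 75]
predicted_offsets = [12, 40, 60, 75]
p, r, f = boundary_prf(gold_offsets, predicted_offsets)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Scoring the old and new models on the same dev set this way would show
whether the newline change trades precision for recall on either kind of note.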

Dima

On Sep 27, 2014, at 8:57, Miller, Timothy 
<timothy.mil...@childrens.harvard.edu> wrote:

> I have been working on the sentence detector newline issue, training a model 
> to probabilistically split sentences on newlines rather than forcing sentence 
> breaks. I have checked in a model to the repo under ctakes-core-res. I also
> attached a patch for ctakes-core to the JIRA issue:
> https://issues.apache.org/jira/browse/CTAKES-41
> 
> for people to test. The status of my testing is that it doesn't seem to break
> on notes where cTAKES worked well before (those where newlines are always
> sentence breaks), and is a slight improvement on notes where newlines may or 
> may not be sentence breaks. Once the change is checked in we can continue 
> improving the model by adding more data and features, but the first hurdle 
> I'd like to get past is making sure it runs well enough on the type of data 
> that the old model worked well on. Let me know if you have any questions.
> 
> Thanks
> Tim
