RE: Ctakes to process 5000K records

Nick Nikandish Tue, 09 Sep 2014 12:03:12 -0700

Pei,
I need the medication names for an application I wrote that uses cTAKES, so I 
cache the medications in DictionaryLookupAnnotator (in performLookup()) and use 
them in my program. With SimpleSegmentAnnotator in the pipeline, however, 
processing takes forever. After taking SimpleSegmentAnnotator out, 
DictionaryLookupAnnotator no longer returns any medication names in my code. 
Is there a way to eliminate SimpleSegmentAnnotator and still get the medication 
names in that class?

Nick

-----Original Message-----
From: Pei Chen [mailto:chen...@apache.org] 
Sent: Tuesday, September 09, 2014 2:54 PM
To: dev@ctakes.apache.org
Subject: Re: Ctakes to process 5000K records

Nick,
When you say no medication is being annotated, I presume you mean the 
medication attributes (i.e. dosage, frequency, etc.) are not being annotated? 
I think the DrugNER needs a list of section names in its config; I think it 
includes SIMPLE_SEGMENT. I am very surprised that SimpleSegmentAnnotator is 
the bottleneck, though; all it does is assume the entire document is a single 
section called SIMPLE_SEGMENT.
Have you tried commenting out the DependencyParser, if you're not using those 
features?

--Pei
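
A minimal sketch of the suggestion above, assuming the flow quoted in the 
original message sits inside the <fixedFlow> element of a UIMA aggregate 
descriptor (the wrapper element is an assumption; only the node names come 
from the thread). SimpleSegmentAnnotator stays in place and only the 
DependencyParser node is commented out. Whether AssertionAnnotator still 
behaves acceptably without dependency parses is not confirmed in this thread 
and would need to be checked.

```xml
<fixedFlow>
  <node>SimpleSegmentAnnotator</node>
  <node>SentenceDetectorAnnotator</node>
  <node>TokenizerAnnotator</node>
  <node>LvgAnnotator</node>
  <node>ContextDependentTokenizerAnnotator</node>
  <node>POSTagger</node>
  <node>Chunker</node>
  <node>LookupWindowAnnotator</node>
  <node>DictionaryLookupAnnotatorDB</node>
  <!-- Commented out as an experiment to see whether parsing is the slow step -->
  <!-- <node>DependencyParser</node> -->
  <node>AssertionAnnotator</node>
  <node>ExtractionPrepAnnotator</node>
</fixedFlow>
```

Timing a small sample of records with and without the commented-out node 
would show whether the parser, rather than SimpleSegmentAnnotator, accounts 
for most of the run time.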


On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish <snika...@emerginghealthit.com> 
wrote:
>
> Hi there,
>
> I am using cTAKES to process 5000K free-text records, where each record has 
> several medications.
> This is the fixed flow that it goes through:
>
> <node>SimpleSegmentAnnotator</node>
> <node>SentenceDetectorAnnotator</node>
> <node>TokenizerAnnotator</node>
> <node>LvgAnnotator</node>
> <node>ContextDependentTokenizerAnnotator</node>
> <node>POSTagger</node>
> <node>Chunker</node>
> <node>LookupWindowAnnotator</node>
> <node>DictionaryLookupAnnotatorDB</node>
> <node>DependencyParser</node>
> <node>AssertionAnnotator</node>
> <node>ExtractionPrepAnnotator</node>
>
> But it takes a very long time (maybe a week or so) to process that much data 
> when I use SimpleSegmentAnnotator. When I eliminate SimpleSegmentAnnotator, 
> processing is very fast, but no medications are annotated. Do you have any 
> suggestions?
>
> Thanks,
> Nick
>
