Nick,

When you say no medication is being annotated, I presume you mean the medication attributes (i.e. dosage, frequency, etc.) are not being annotated? I think DrugNER needs a list of section names in its config, and I believe that list includes SIMPLE_SEGMENT, which would explain why nothing is annotated once that annotator is removed. I am very surprised that SimpleSegmentAnnotator is the bottleneck, though; all it does is treat the entire document as a single section called SIMPLE_SEGMENT. Have you tried commenting out the DependencyParser, if you're not using those features?
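As a rough sketch of what I mean, here is the flow from your message with the DependencyParser node commented out. I am assuming these nodes sit inside the `<fixedFlow>` element of your aggregate descriptor, and that nothing later in the flow (possibly AssertionAnnotator) needs the dependency parses; if something does, that annotator would have to be adjusted or removed as well.

```xml
<fixedFlow>
  <node>SimpleSegmentAnnotator</node>
  <node>SentenceDetectorAnnotator</node>
  <node>TokenizerAnnotator</node>
  <node>LvgAnnotator</node>
  <node>ContextDependentTokenizerAnnotator</node>
  <node>POSTagger</node>
  <node>Chunker</node>
  <node>LookupWindowAnnotator</node>
  <node>DictionaryLookupAnnotatorDB</node>
  <!-- Disabled to test whether dependency parsing is the slow step -->
  <!-- <node>DependencyParser</node> -->
  <node>AssertionAnnotator</node>
  <node>ExtractionPrepAnnotator</node>
</fixedFlow>
```

Timing a small batch of records with and without that node should show whether the parser, rather than SimpleSegmentAnnotator, accounts for most of the run time.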
--Pei

On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish <snika...@emerginghealthit.com> wrote:
>
> Hi there,
>
> I am using cTAKES to process 5000K free text records where each record has
> several medications.
> This is the fixed flow that it goes through:
>
> <node>SimpleSegmentAnnotator</node>
> <node>SentenceDetectorAnnotator</node>
> <node>TokenizerAnnotator</node>
> <node>LvgAnnotator</node>
> <node>ContextDependentTokenizerAnnotator</node>
> <node>POSTagger</node>
> <node>Chunker</node>
> <node>LookupWindowAnnotator</node>
> <node>DictionaryLookupAnnotatorDB</node>
> <node>DependencyParser</node>
> <node>AssertionAnnotator</node>
> <node>ExtractionPrepAnnotator</node>
>
> But it takes a very, very long time to process that much data (maybe a week or
> so) when I use SimpleSegmentAnnotator. By eliminating SimpleSegmentAnnotator
> the process is very fast, but no medication is being annotated. Do you guys
> have any suggestions?
>
> Thanks,
> Nick
>