We're running cTAKES version 4.0.0.1 on ~12K notes. The first time we ran it,
I got a heap space error after ~10.5K notes had been processed (about 38
hours in).

I increased the heap space params and then reran. This time it died at the
same place, but with a different error (see below):

SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
        at org.apache.ctakes.contexttokenizer.ae.ContextDependentTokenizerAnnotator.process(ContextDependentTokenizerAnnotator.java:105)
        at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
        at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
        at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
...
Caused by: java.lang.NumberFormatException: For input string: "f"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
        at java.lang.Double.parseDouble(Double.java:538)
--------

Thus, it looks like a non-numeric string ("f") is being parsed as a
floating-point number. This had worked in version 4.0.1, so it must have
been fixed at some point. I made the changes for the new NLM authentication
for UMLS, based on Peter's authentication solution, and tested them in
4.0.1, but it still stopped working after January 15th. Unfortunately,
we're not set up to compile 4.0.1.
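
For reference, the root cause at the bottom of the trace can be reproduced
outside of cTAKES: Double.parseDouble throws a NumberFormatException when it
is handed the bare string "f". The following is only a minimal sketch of
that behavior. The class name ParseCheck and the helper isParsableDouble are
hypothetical names made up for illustration and are not part of cTAKES or
UIMA.

```java
public class ParseCheck {
    // Hypothetical helper (not cTAKES code): returns true only if
    // Double.parseDouble accepts the token without throwing.
    static boolean isParsableDouble(String token) {
        try {
            Double.parseDouble(token);
            return true;
        } catch (NumberFormatException e) {
            // Same exception type as in the "Caused by" line of the trace.
            return false;
        }
    }

    public static void main(String[] args) {
        // "3.14" is a valid double literal; a bare "f" is not.
        for (String token : new String[] {"3.14", "f"}) {
            System.out.println(token + " -> " + isParsableDouble(token));
        }
    }
}
```

This says nothing about which token in which note triggers the failure; it
only shows why a lone "f" reaching Double.parseDouble produces this error.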

That being said, does someone have a working version of 4.0.1 built from
the trunk? If so, could you please send me a copy?

If not, how can I find the offending file?
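
One generic way to narrow this down would be to run the notes one at a time
and record which files throw, rather than letting the whole batch die. The
sketch below only illustrates that idea under stated assumptions: the
NoteProcessor interface is a hypothetical stand-in for whatever actually
runs the pipeline on a single note, and the class name FindBadNotes and the
method findFailures are likewise made up. No cTAKES or UIMA API is used or
implied.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class FindBadNotes {
    /** Hypothetical stand-in for "run the pipeline on one note". */
    interface NoteProcessor {
        void process(Path note) throws Exception;
    }

    /** Processes each regular file in dir separately and collects the ones that throw. */
    static List<Path> findFailures(Path dir, NoteProcessor processor) throws IOException {
        List<Path> failures = new ArrayList<>();
        try (DirectoryStream<Path> notes = Files.newDirectoryStream(dir)) {
            for (Path note : notes) {
                if (!Files.isRegularFile(note)) {
                    continue; // skip subdirectories and other non-files
                }
                try {
                    processor.process(note);
                } catch (Exception e) {
                    // Record the file and keep going with the remaining notes.
                    failures.add(note);
                    System.err.println("FAILED: " + note + " (" + e + ")");
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        // Placeholder processor that only reads the file; replace it with
        // the real per-note pipeline call.
        NoteProcessor placeholder = note -> Files.readAllBytes(note);
        List<Path> failures = findFailures(dir, placeholder);
        System.out.println(failures.size() + " file(s) failed");
    }
}
```

Note that this catches Exception only, so a heap space error (an
OutOfMemoryError) would still stop the run.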

This is fairly critical, since we're in the middle of an experiment.
Another side effect of reverting to 4.0.0.1 is that it is a LOT slower than
4.0.1.

Thanks very much in advance!

Greg--

-- 
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu
