We're running version 4.0.0.1 on ~12K notes. The first time we ran it, I got a heap space error at ~10.5K notes processed (about 38 hours in).
I increased the heap space params and reran. This time it died at the same place, but with a different error (see below):

SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.contexttokenizer.ae.ContextDependentTokenizerAnnotator.process(ContextDependentTokenizerAnnotator.java:105)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    ...
Caused by: java.lang.NumberFormatException: For input string: "f"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.lang.Double.parseDouble(Double.java:538)
--------

Thus, it looks like the string "f" is being treated as a floating-point number. This had worked in version 4.0.1, so it must have been fixed at some point. Although I made changes for the new NLM authentication for UMLS and tested them in 4.0.1, based on Peter's authentication solution, it stopped working after January 15th. Unfortunately, we're not set up to compile 4.0.1.

That being said, does someone have a working version of 4.0.1 built from the trunk? If so, could you please send me a copy? If not, how can I find the offending file?
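For what it's worth, the root cause at the bottom of the trace can be reproduced outside the pipeline. The following is only a minimal sketch, not cTAKES code: the class name ParseCheck and the method failsToParse are hypothetical, and it assumes nothing about how the tokenizer arrives at the token "f". It only shows that Double.parseDouble rejects a bare "f", while a number with a trailing float suffix is accepted.

```java
public class ParseCheck {
    /**
     * Hypothetical helper: returns true if Double.parseDouble
     * rejects the given token with a NumberFormatException.
     */
    public static boolean failsToParse(String token) {
        try {
            Double.parseDouble(token);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "f" alone contains no digits, so it is rejected; "1.5f" is
        // accepted because Java allows a trailing float suffix.
        for (String token : new String[] {"f", "1.5f", "1.5"}) {
            System.out.println(token + " -> rejected: " + failsToParse(token));
        }
    }
}
```

So a note in which some token is reduced to a bare "f" before being parsed as a number would be enough to trigger this, though that is a guess on my part.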
This is kind of critical, since we're in the middle of an experiment, and another side effect of reverting to 4.0.0.1 is that it is a LOT slower than 4.0.1.

Thanks very much in advance!

Greg

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu