Hi Peter, There are untold miles of code that I have never traveled ...
I don't understand the -1. Maybe the original code used some other code ( .start() ) that wasn't zero-based? It seems like nec.begin should not be offset, and that consumers down the line should offset to n-1 or .max(n-1,0). Which of course means that any fixes need to have propagated adjustments. Yay.

I think that your ICLA has gone through (congratulations). Just in time, right?

Sean

________________________________________
From: Peter Abramowitsch <pabramowit...@gmail.com>
Sent: Monday, October 12, 2020 6:05 PM
To: dev@ctakes.apache.org
Subject: Found code relating to a bug I reported a few weeks ago. [EXTERNAL]

Hi Sean

If you know every inch of the code, maybe I can ask you what you think of this problem I found in the negex annotator. It causes any sentence to crash when the very first character begins the negation:

"Absence of headache" causes a crash later on in another annotator because the ContextAnnotation it creates has a begin offset of -1.
" Absence of headache" successfully annotates the phrase.

I need to fix this urgently, but I found a mysterious piece of code that is responsible for this. I'm working off a trunk snapshot 4.0.1 taken Dec 27 2018.

NegexAnnotator.annotateNegation() at line 846 of its class file:

    846: nec.setBegin(s.getBegin() + t.getStart() - 1);

In the case where a sentence begins with "Absence of...." both s (Sentence) and t (negex token) begin at offset 0, so the computed begin offset is -1. Then the ContextAnnotation goes on its destructive way. So what's with the -1? Of course it also fails with "No headache..." as the beginning of a sentence.

If you know, or have a hunch, why the -1 at that line is there, I will track it down further. Otherwise I'm just tempted to leave it and apply Max(calcOffset, 0).

The crash actually occurs as control passes to the historyOf annotator, which looks at the ContextAnnotation created by Negex.
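For illustration, here is a minimal sketch of the arithmetic at line 846 and of the Max(calcOffset, 0) clamp mentioned above. It is not cTAKES code: the class NegexOffsetSketch and the helpers rawBegin and clampedBegin are hypothetical names, and plain ints stand in for s.getBegin() and t.getStart().

```java
public class NegexOffsetSketch {

    // Hypothetical helper: same arithmetic as line 846,
    // sentence begin + token start - 1.
    static int rawBegin(int sentenceBegin, int tokenStart) {
        return sentenceBegin + tokenStart - 1;
    }

    // Hypothetical helper: the clamp under discussion, never below 0.
    static int clampedBegin(int sentenceBegin, int tokenStart) {
        return Math.max(rawBegin(sentenceBegin, tokenStart), 0);
    }

    public static void main(String[] args) {
        String doc = "Absence of headache";

        // Sentence and negation token both start at offset 0.
        int raw = rawBegin(0, 0);
        int safe = clampedBegin(0, 0);
        System.out.println("raw begin = " + raw + ", clamped begin = " + safe);

        // The clamped offset can be used to read the covered text.
        System.out.println(doc.substring(safe, 7));

        // The raw offset reproduces the kind of failure seen in the stack trace.
        try {
            doc.substring(raw, 7);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("raw offset fails: " + e.getClass().getSimpleName());
        }
    }
}
```

The clamp only guards the sentence-initial case. Every other annotation still carries the -1 shift, so it does not answer whether the offset belongs there at all.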
See below the signature for the stack trace.

My ICLA has not been approved yet so I can't make any alterations to source, nor would I without any orientation to the process. Never done it before, nor have the keys to Jira.

Anyone else who knows the NegexNegator in depth, please chime in as well.

Peter

--------------------------------
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1960)
    at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:128)
    at org.apache.ctakes.dependency.parser.util.DependencyUtility.doesSubsume(DependencyUtility.java:67)
    at org.apache.ctakes.dependency.parser.util.DependencyUtility.getDependencyNodes(DependencyUtility.java:104)
    at org.apache.ctakes.dependency.parser.util.DependencyUtility.getNominalHeadNode(DependencyUtility.java:113)
    at org.apache.ctakes.assertion.attributes.history.HistoryAttributeClassifier.extract(HistoryAttributeClassifier.java:2