Re: Found code relating to a bug I reported a few weeks ago. [EXTERNAL]

Peter Abramowitsch Thu, 15 Oct 2020 11:14:19 -0700

Thanks Sean

Yes, this morning I got access, but I'm not in a hurry to start tampering
with the archive.  If you want to take this into another mail stream it's
fine - perhaps better.  I have a couple of general and specific
questions:


1.  About the fix we discussed above: are you suggesting we just let the
negex annotator begin at offset 0 + token-pos, and then see if anything
stops working (or improves!) downstream, now that at least historyOf
no longer throws an exception?  There are so many downstream permutations,
given all the possible annotators, that it would be impossible to test all
of them.  And unless we see another of these -1 oddities in other places
where context annotations are created, can we just assume that it is
idiosyncratic to Negex for historical reasons?

2.  Is the official archive in Git now or in SVN?  The Apache root mentioned
SVN only.  If SVN, what is your favorite GUI, if you use one?

3.  I prefer a peer-review/pull-request type interaction if possible.  I
would hate to introduce rubbish even if entitled to do so.  Do you already
implement something like that?

4.  What about Jira?  Do permissions and links come from Apache?  I've been
on it briefly in read-only mode.
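
To make question 1 concrete, here is a minimal sketch of the two options on the
table: keep the -1 but clamp at 0, or drop the -1 altogether. It is not cTAKES
code. The class and method names are hypothetical, and plain ints and Strings
stand in for the Sentence, the negex token and the ContextAnnotation. The only
things taken from the thread are the begin arithmetic and the fact that the
covered-text lookup ends up in String.substring.

```java
// Hypothetical stand-in for the offset arithmetic discussed in this thread.
// None of these names exist in cTAKES; they are for illustration only.
class NegexOffsetSketch {

    // Mirrors the quoted line: sentence begin + token start - 1.
    static int shiftedBegin(int sentenceBegin, int tokenStart) {
        return sentenceBegin + tokenStart - 1;
    }

    // Option A: keep the -1, but never let the begin offset drop below 0.
    static int clampedBegin(int sentenceBegin, int tokenStart) {
        return Math.max(shiftedBegin(sentenceBegin, tokenStart), 0);
    }

    // Option B: drop the -1 and leave any adjustment to downstream consumers.
    static int unshiftedBegin(int sentenceBegin, int tokenStart) {
        return sentenceBegin + tokenStart;
    }

    // Stand-in for a covered-text lookup that delegates to String.substring.
    static String coveredText(String document, int begin, int end) {
        return document.substring(begin, end);
    }

    public static void main(String[] args) {
        String doc = "Absence of headache";
        int end = "Absence".length();

        try {
            // Sentence and token both start at 0, so begin is -1.
            coveredText(doc, shiftedBegin(0, 0), end);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("shifted begin fails: " + e.getMessage());
        }

        // Both alternatives give a usable begin offset for this input.
        System.out.println(coveredText(doc, clampedBegin(0, 0), end));
        System.out.println(coveredText(doc, unshiftedBegin(0, 0), end));
    }
}
```

For a token at position 0 the two options agree. For any later token they differ
by one character (a token start of 1 gives 0 with the clamp and 1 without the
-1), which is why the choice matters to every downstream consumer.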

There's no hurry to respond.  I'm on my way back to Italy shortly and will
set up shop there again next week sometime.

Regards, Peter

On Wed, Oct 14, 2020 at 2:12 PM Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Peter,
>
> There are untold miles of code that I have never traveled ...
>
> I don't understand the -1.  Maybe the original code used some other code
> ( .start() ) that wasn't zero-based?
>
> It seems like nec.begin should not be offset and that down the line
> consumers should offset to n-1 or .max(n-1,0).  Which of course means that
> any fixes need to have propagated adjustments.  Yay.
>
> I think that your ICLA has gone through (congratulations).  Just in time,
> right?
>
> Sean
>
>
> ________________________________________
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Monday, October 12, 2020 6:05 PM
> To: dev@ctakes.apache.org
> Subject: Found code relating to a bug I reported a few weeks ago.
> [EXTERNAL]
>
>
> Hi Sean
>
> If you know every inch of the code maybe I can ask you what you think of
> this problem I found in the negex annotator.   It causes any sentence to
> crash when the very first character begins the negation:
>
> *"Absence of headache"  *
> causes a crash later on in another annotator because the ContextAnnotation
> it creates has a begin offset of -1.
>
> *" Absence of headache" *
> successfully annotates the phrase.
>
> I need to fix this urgently, but I found a mysterious piece of code that is
> responsible for this.
> I'm working off a trunk snapshot 4.0.1 taken Dec 27  2018
>
> NegexAnnotator.annotateNegation()   at line 846 of its class file:
>
> *846: nec.setBegin(s.getBegin() + t.getStart() - 1);*
>
> In the case where a sentence begins with "Absence of...", both s
> (Sentence) and t (negex token) begin at offset 0, so the computed begin
> is -1.  Then the ContextAnnotation goes on its destructive way.  So what's
> with the -1?
> Of course it also fails with "No headache..." as the beginning of a
> sentence.
>
> If you know, or have a hunch, why the -1 at that line is there, I will track
> it down further.  Otherwise I'm tempted to just leave it and wrap the
> result in Math.max(calcOffset, 0).
>
> The crash actually occurs as control passes to the historyOf annotator
> which looks at the ContextAnnotation created by Negex.
> See below the signature for the stack trace
>
> My ICLA has not been approved yet, so I can't make any alterations to the
> source, nor would I without some orientation to the process.  I have never
> done it before, nor do I have the keys to Jira.
>
> Anyone else who knows the NegexNegator in depth, please chime in as well.
>
> Peter
> --------------------------------
>
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> at java.lang.String.substring(String.java:1960)
> at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:128)
> at org.apache.ctakes.dependency.parser.util.DependencyUtility.doesSubsume(DependencyUtility.java:67)
> at org.apache.ctakes.dependency.parser.util.DependencyUtility.getDependencyNodes(DependencyUtility.java:104)
> at org.apache.ctakes.dependency.parser.util.DependencyUtility.getNominalHeadNode(DependencyUtility.java:113)
> at org.apache.ctakes.assertion.attributes.history.HistoryAttributeClassifier.extract(HistoryAttributeClassifier.java:2
>
