CharacterOffsetToLineTokenConverterCtakesImpl.java

digital paula Tue, 14 Jan 2014 20:26:37 -0800




Hello cTAKES Developer Community,
I'm a little behind on reading posts; this one is from last month. I think
this issue may already be addressed in the current release, but I'm still
running the previous release, 3.1.0.
I just noticed something interesting: negation is not detected when the
negated term is on a different line from the negation cue. When I removed all
carriage returns from the narratives, so that each one is treated as one long
string, negation was picked up. To better explain what I mean, here are two
narrative comments:
 
1.  patient did not have diabetes 
2. patient did not have 
diabetes
 
Number 1 above was negated but number 2 was not. This might be related to the
issue with the sectionizer: I noticed that when I treat the narrative as one
string, the sectionizer never crashes with the NPE. Of course, the sectionizer
serves no purpose if the narrative is one string, but this is helping me
pinpoint the problem.
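
In case it helps anyone reproduce this, the workaround described above can be
sketched roughly as follows. This is only an illustration, not cTAKES code:
the class name NarrativeFlattener and the method flatten are hypothetical, and
replacing each line break with a single space (rather than deleting it, which
would fuse adjacent words) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES: collapses a multi-line narrative
// into one line before it is handed to the pipeline.
final class NarrativeFlattener {
    // One or more line breaks (\R covers \r\n, \n and \r) together with any
    // spaces or tabs directly around them.
    private static final Pattern LINE_BREAK =
            Pattern.compile("[ \\t]*\\R+[ \\t]*");

    // Assumption: each break becomes a single space so words stay separated.
    static String flatten(String narrative) {
        return LINE_BREAK.matcher(narrative).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // Narrative 2 from the example above, with a Windows-style line break.
        System.out.println(flatten("patient did not have \r\ndiabetes"));
    }
}
```

Note that flattening changes character offsets wherever whitespace is dropped,
so any annotation offsets refer to the flattened text, not to the original
narrative.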
 
Regards,
Paula

 
> Date: Thu, 19 Dec 2013 11:04:57 -0500
> Subject: Re: FW: svn commit: r1551805 - 
> /ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> From: vnga...@gmail.com
> To: dev@ctakes.apache.org
> 
> Hi Pei,
> 
> I'm not sure if that would solve the problem: the change in the ytex branch
> causes newlines to be ignored (i.e. not treated as a token). Trunk's
> sentence splitter splits sentences on newlines, so a newline would never
> be found inside a sentence. However, if we had a reproducer we could check
> it fairly easily in the ytex branch.
> 
> Best,
> 
> VJ
> 
> 
> On Thu, Dec 19, 2013 at 10:15 AM, Chen, Pei
> <pei.c...@childrens.harvard.edu> wrote:
> 
> > Vj,
> > Do you think this is what was causing the NPEs [1]?
> > If so, shall we make the same fix in trunk?
> > --Pei
> >
> > [1]
> > http://mail-archives.apache.org/mod_mbox/ctakes-dev/201309.mbox/%3C924DE05C19409B438EB81DE683A942D9105A93CB%40CHEXMBX1A.CHBOSTON.ORG%3E
> >
> > -----Original Message-----
> > From: vjapa...@apache.org [mailto:vjapa...@apache.org]
> > Sent: Tuesday, December 17, 2013 9:15 PM
> > To: comm...@ctakes.apache.org
> > Subject: svn commit: r1551805 -
> > /ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> >
> > Author: vjapache
> > Date: Wed Dec 18 02:14:13 2013
> > New Revision: 1551805
> >
> > URL: http://svn.apache.org/r1551805
> > Log:
> > add support for sentences that contain newline tokens.
> >
> > Modified:
> >
> > ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> >
> > Modified:
> > ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> > URL:
> > http://svn.apache.org/viewvc/ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java?rev=1551805&r1=1551804&r2=1551805&view=diff
> >
> > ==============================================================================
> > ---
> > ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> > (original)
> > +++ ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java Wed Dec 18 02:14:13 2013
> > @@ -32,8 +32,8 @@ import org.apache.uima.jcas.tcas.Annotat
> >  import org.mitre.medfacts.i2b2.api.ApiConcept;
> >  import org.mitre.medfacts.zoner.CharacterOffsetToLineTokenConverter;
> >  import org.mitre.medfacts.zoner.LineAndTokenPosition;
> > -
> >  import org.apache.ctakes.typesystem.type.syntax.BaseToken;
> > +import org.apache.ctakes.typesystem.type.syntax.NewlineToken;
> >  import org.apache.ctakes.typesystem.type.textspan.Sentence;
> >
> >  public class CharacterOffsetToLineTokenConverterCtakesImpl implements
> > CharacterOffsetToLineTokenConverter
> > @@ -78,11 +78,13 @@ public class CharacterOffsetToLineTokenC
> >           for (Annotation current : annotationIndex)
> >           {
> >                   BaseToken bt = (BaseToken)current;
> > -                 int begin = bt.getBegin();
> > -                 int end = bt.getEnd();
> > -
> > -                 tokenBeginEndTreeSet.add(begin);
> > -                 tokenBeginEndTreeSet.add(end);
> > +                 // filter out NewlineToken
> > +                 if (!(bt instanceof NewlineToken)) {
> > +                         int begin = bt.getBegin();
> > +                         int end = bt.getEnd();
> > +                         tokenBeginEndTreeSet.add(begin);
> > +                         tokenBeginEndTreeSet.add(end);
> > +                 }
> >           }
> >    }
> >
> >
> >
> >
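
For anyone reading the diff above without the cTAKES sources at hand, here is
a minimal, self-contained sketch of what the change appears to do: newline
tokens are skipped when token begin/end offsets are collected into the sorted
set. The classes Token and NewlineTok are hypothetical stand-ins for cTAKES'
BaseToken and NewlineToken, and tokenBoundaries is an illustrative name, not
a method from the commit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class NewlineFilterSketch {
    // Hypothetical stand-in for BaseToken: just a character span.
    static class Token {
        final int begin;
        final int end;
        Token(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Hypothetical stand-in for NewlineToken.
    static class NewlineTok extends Token {
        NewlineTok(int begin, int end) { super(begin, end); }
    }

    // Collects the begin and end offsets of every token except newline
    // tokens, mirroring the instanceof filter in the diff.
    static TreeSet<Integer> tokenBoundaries(List<Token> tokens) {
        TreeSet<Integer> boundaries = new TreeSet<>();
        for (Token t : tokens) {
            if (t instanceof NewlineTok) {
                continue; // filter out newline tokens
            }
            boundaries.add(t.begin);
            boundaries.add(t.end);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        // Text "no \nfever": "no" = 0..2, newline = 3..4, "fever" = 4..9.
        List<Token> tokens = Arrays.asList(
                new Token(0, 2), new NewlineTok(3, 4), new Token(4, 9));
        // Offset 3 would be present without the filter.
        System.out.println(tokenBoundaries(tokens));
    }
}
```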