RE: Question on ctakes

Finan, Sean
Sun, 15 Jan 2017 10:56:03 -0800

Hi Vighnesh,

> I have changed JCasUtil.select ....
The original lambda
a -> new DefaultTextSpan(a, 0)
should not be changed.  Use the DefaultTextSpan in core.cc.pretty.textspan, 
as it has the overlaps(..) convenience method.  That class and the similar 
class in lookup2 should be merged at some point ...
The constructor you are using:
   /**
    * @param annotation     -
    * @param sentenceOffset begin span offset of the containing sentence
    */
   public DefaultTextSpan( final AnnotationFS annotation, final int sentenceOffset ) {
      this( annotation.getBegin() - sentenceOffset, annotation.getEnd() - sentenceOffset );
   }
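
To see why swapping in a.getBegin() goes wrong, here is a minimal sketch.  
SpanSketch and its nested Span are hypothetical stand-ins, not the cTAKES 
classes.  It assumes only what the constructor above shows: a two-int 
(begin, end) constructor exists, and the annotation constructor subtracts 
sentenceOffset from both offsets.

```java
public class SpanSketch {

   // Hypothetical stand-in for DefaultTextSpan; only the offset arithmetic is modeled.
   static final class Span {
      final int begin;
      final int end;

      // Mirrors the two-int constructor that the annotation constructor delegates to.
      Span( final int begin, final int end ) {
         this.begin = begin;
         this.end = end;
      }

      // Mirrors the annotation constructor, with plain ints in place of AnnotationFS.
      static Span fromAnnotation( final int annotationBegin, final int annotationEnd,
                                  final int sentenceOffset ) {
         return new Span( annotationBegin - sentenceOffset, annotationEnd - sentenceOffset );
      }

      @Override
      public String toString() {
         return begin + "-" + end;
      }
   }

   public static void main( final String[] args ) {
      // An annotation covering document characters 25 to 30.
      // A sentenceOffset of 0 keeps the document offsets.
      System.out.println( Span.fromAnnotation( 25, 30, 0 ) );   // 25-30
      // A containing sentence that starts at 20 gives sentence-relative offsets.
      System.out.println( Span.fromAnnotation( 25, 30, 20 ) );  // 5-10
      // Passing a.getBegin() and 0 selects the (begin, end) form, so the end becomes 0.
      System.out.println( new Span( 25, 0 ) );                  // 25-0
   }
}
```

So new DefaultTextSpan( a.getBegin(), 0 ) would compile, but the resulting 
span ends at 0 and is unlikely to overlap anything useful.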


> I could not make out what needs to be added in place of BaseToken ...
Pass the lambda parameter bt, not the class name:
TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
should be
TextSpan ts = new DefaultTextSpan( bt, 0 );
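
For reference, the whole filter can be sketched without cTAKES on the 
classpath.  Everything here is a stand-in: Span, Token and 
UncoveredNounsSketch are hypothetical names, the overlap rule (spans share 
at least one character) is an assumption about what overlaps(..) does, and 
anyMatch / negate are plain java.util.stream and java.util.function calls 
used in place of the untested stream methods from my earlier email.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class UncoveredNounsSketch {

   // Hypothetical stand-in for TextSpan / DefaultTextSpan.
   static final class Span {
      final int begin;
      final int end;

      Span( final int begin, final int end ) {
         this.begin = begin;
         this.end = end;
      }

      // Assumed semantics: two spans overlap when they share at least one character.
      boolean overlaps( final Span other ) {
         return begin < other.end && other.begin < end;
      }
   }

   // Hypothetical stand-in for BaseToken: text, part of speech, character offsets.
   static final class Token {
      final String text;
      final String partOfSpeech;
      final int begin;
      final int end;

      Token( final String text, final String partOfSpeech, final int begin, final int end ) {
         this.text = text;
         this.partOfSpeech = partOfSpeech;
         this.begin = begin;
         this.end = end;
      }
   }

   // Returns the text of every "NN" token that no identified span overlaps.
   static List<String> uncoveredNouns( final List<Token> tokens, final List<Span> identifiedSpans ) {
      final Predicate<Token> overlapped = bt -> {
         final Span ts = new Span( bt.begin, bt.end );
         return identifiedSpans.stream().anyMatch( s -> s.overlaps( ts ) );
      };
      return tokens.stream()
                   .filter( bt -> "NN".equals( bt.partOfSpeech ) )
                   .filter( overlapped.negate() )
                   .map( bt -> bt.text )
                   .collect( Collectors.toList() );
   }

   static List<String> demo() {
      // "patient had fever": only "fever" (12-17) is covered by an identified span.
      final List<Token> tokens = Arrays.asList(
            new Token( "patient", "NN", 0, 7 ),
            new Token( "had", "VBD", 8, 11 ),
            new Token( "fever", "NN", 12, 17 ) );
      final List<Span> identified = Arrays.asList( new Span( 12, 17 ) );
      return uncoveredNouns( tokens, identified );
   }

   public static void main( final String[] args ) {
      demo().forEach( System.out::println );   // prints: patient
   }
}
```

In real code the Token list would come from JCasUtil.select( jcas, 
BaseToken.class ) and the spans from the IdentifiedAnnotations, as in the 
earlier email.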

My apologies if you spent a lot of time debugging - I try to get the details 
into these emails but don't really have time to write and run everything 
myself.  When you need help on the devlist (or other forums), it might help 
people to know a little about your development experience / background.

I hope that this fixes everything,
Sean





-----Original Message-----
From: Sparsh K [mailto:sparsh...@gmail.com] 
Sent: Sunday, January 15, 2017 12:26 PM
To: dev@ctakes.apache.org
Cc: dev-...@ctakes.apache.org
Subject: Re: Question on ctakes

Hi Sean,

I tried to use your solution.

I got a few compilation errors and fixed a few of them.

I have changed JCasUtil.select( jcas, IdentifiedAnnotation.class 
).stream().map( a -> new DefaultTextSpan(a, 0) )

to JCasUtil.select( jcas, IdentifiedAnnotation.class ).stream().map( a -> new 
DefaultTextSpan(*a.getBegin()*, 0) ). I hope this is correct.


I could not make out what needs to be added in place of BaseToken in the case 
below.

TextSpan ts = new DefaultTextSpan( BaseToken, 0 );


Thanks & Regards
Vighnesh




On Thu, Jan 12, 2017 at 10:12 PM, Sparsh K <sparsh...@gmail.com> wrote:

> Thanks for the clarification, Sean.
>
> On Thu, Jan 12, 2017 at 8:43 PM, Finan, Sean < 
> sean.fi...@childrens.harvard.edu> wrote:
>
>> Hi Vighnesh,
>>
>> 1.  Does ctakes depend upon exact word match?
>>         By default, yes.  The fast clinical pipeline uses 
>> "DefaultJCasTermAnnotator" or some such horribly named class.  There 
>> is also an "OverlapJCasTermAnnotator".  Equally horrible name, 
>> slightly different functionality.  Given: "Blood, urine test" the 
>> Default will identify "blood", "urine" and "urine test".  The overlap 
>> will identify "Blood", "urine", "urine test" and "blood test".  
>> Obviously this requires all four terms to be in the dictionary.
>>
>> 2.  How to get all nouns in a document not covered by an 
>> IdentifiedAnnotation?
>>
>> JCasUtil.select( jcas, BaseToken.class ).stream().filter( b ->
>> "NN".equals( b.getPartOfSpeech() ) ).map( Annotation::getCoveredText 
>> ).forEach( System.out::println );
>>
>> Something like that should work.  Filtering by discovered 
>> IdentifiedAnnotations is another step.  Something like:
>>
>> Collection<TextSpan> identifiedSpans = JCasUtil.select( jcas, 
>> IdentifiedAnnotation.class ).stream().map( a -> new 
>> DefaultTextSpan(a, 0) ).collect( Collectors.toList() );
>>
>> Predicate<BaseToken> overlapped = bt -> {
>>    TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
>>    return identifiedSpans.stream().anyMatch( s -> s.overlaps(ts) );
>> };
>>
>> Then add .filter( overlapped.negate() ) before the original .map( 
>> Annotation::getCoveredText ).  I am not debugging this email, so you 
>> may need to check my stream methods.
>>
>> Sean
>>
>>
>> -----Original Message-----
>> From: Sparsh K [mailto:sparsh...@gmail.com]
>> Sent: Thursday, January 12, 2017 7:31 AM
>> To: dev-...@ctakes.apache.org; dev@ctakes.apache.org
>> Subject: Question on ctakes
>>
>> Hi
>>
>> I am new to ctakes and have a few questions. Please guide me with 
>> your input.
>>
>> 1. When a clinical note is input to ctakes, it processes the text in 
>> multiple stages.
>> Take this example clinical note: SINGLE/PRETERM (35 WEEKS 5 
>> DAYS)/MALE/AGA.
>>
>> Here the word "preterm" is not in the dictionary, although "preterm 
>> infant", "premature baby", etc. are. So ctakes does not identify that 
>> word as coveredText.
>>
>> My question: does ctakes processing depend mainly on exact word 
>> matches with the dictionary? If so, and I give it a one-page clinical 
>> note that describes a disease without using words that exactly match 
>> the dictionary, then ctakes will not identify those words. Is that true?
>>
>> 2. Ctakes does POS tagging and performs named entity recognition on the 
>> noun terms. How can I pull out a list of the nouns that were not 
>> matched to a named disorder code at the named entity recognition level?
>>
>>
>> Regards
>> Vighnesh
>>
>
>
