Hi Sean,

I tried to use your solution. I got a few compilation errors and fixed a few
of them.

I have changed JCasUtil.select( jcas, IdentifiedAnnotation.class
).stream().map( a -> new DefaultTextSpan(a, 0) )

to JCasUtil.select( jcas, IdentifiedAnnotation.class ).stream().map( a ->
new DefaultTextSpan(a.getBegin(), 0) ). I hope this is correct.


 I could not make out what should replace BaseToken in the case
below:

 TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
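
My best guess is that the lambda parameter bt (the token being tested) is
meant there, rather than the type name. Below is a minimal self-contained
sketch of the filter as I understand it. It does not use the cTAKES classes:
Span and Token are hypothetical stand-ins for TextSpan and BaseToken, and
uncoveredNouns is a name I made up for illustration. It also uses
Predicate.negate(), since Java does not allow !overlapped on a Predicate.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class OverlapFilterSketch {

    // Hypothetical stand-in for a text span: [begin, end) character offsets.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        // Two half-open spans overlap when each starts before the other ends.
        boolean overlaps(Span other) {
            return begin < other.end && other.begin < end;
        }
    }

    // Hypothetical stand-in for a token: offsets, covered text and POS tag.
    static final class Token {
        final Span span;
        final String text;
        final String pos;

        Token(int begin, int end, String text, String pos) {
            this.span = new Span(begin, end);
            this.text = text;
            this.pos = pos;
        }
    }

    // Returns the text of every "NN" token that overlaps no identified span.
    static List<String> uncoveredNouns(List<Token> tokens, List<Span> identifiedSpans) {
        Predicate<Token> overlapped =
                t -> identifiedSpans.stream().anyMatch(s -> s.overlaps(t.span));
        return tokens.stream()
                .filter(t -> "NN".equals(t.pos))   // null-safe POS check
                .filter(overlapped.negate())       // keep only uncovered tokens
                .map(t -> t.text)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Tokens of "Blood, urine test" with character offsets.
        List<Token> tokens = List.of(
                new Token(0, 5, "Blood", "NN"),
                new Token(5, 6, ",", ","),
                new Token(7, 12, "urine", "NN"),
                new Token(13, 17, "test", "NN"));
        // Assume only "Blood" (offsets 0-5) was identified.
        List<Span> identified = List.of(new Span(0, 5));
        System.out.println(uncoveredNouns(tokens, identified)); // prints [urine, test]
    }
}
```

If this reading is right, the same shape should carry over to the real
classes, but I have not verified that against the cTAKES API.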


Thanks & Regards
Vighnesh




On Thu, Jan 12, 2017 at 10:12 PM, Sparsh K <sparsh...@gmail.com> wrote:

> Thanks for the clarification, Sean.
>
> On Thu, Jan 12, 2017 at 8:43 PM, Finan, Sean <
> sean.fi...@childrens.harvard.edu> wrote:
>
>> Hi Vighnesh,
>>
>> 1.  Does ctakes depend upon exact word match?
>>         By default, yes.  The fast clinical pipeline uses
>> "DefaultJCasTermAnnotator" or some such horribly named class.  There is
>> also an "OverlapJCasTermAnnotator".  Equally horrible name, slightly
>> different functionality.  Given: "Blood, urine test" the Default will
>> identify "blood", "urine" and "urine test".  The overlap will identify
>> "Blood", "urine", "urine test" and "blood test".  Obviously this requires
>> all four terms to be in the dictionary.
>>
>> 2.  How to get all nouns in a document not covered by an
>> IdentifiedAnnotation?
>>
>> JCasUtil.select( jcas, BaseToken.class ).stream().filter( b ->
>> "NN".equals( b.getPartOfSpeech() ) ).map( Annotation::getCoveredText
>> ).forEach( System.out::println );
>>
>> Something like that should work.  Filtering by discovered
>> IdentifiedAnnotations is another step.  Something like:
>>
>> Collection<TextSpan> identifiedSpans = JCasUtil.select( jcas,
>> IdentifiedAnnotation.class ).stream().map( a -> new DefaultTextSpan(a, 0)
>> ).collect( Collectors.toList() );
>>
>> Predicate<BaseToken> overlapped = bt -> {
>>    TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
>>    return identifiedSpans.stream().anyMatch( s -> s.overlaps( ts ) );
>> };
>>
>> Then add .filter( overlapped.negate() ) before the original .map(
>> Annotation::getCoveredText ).  I am not debugging this email, so you may
>> need to check my stream methods.
>>
>> Sean
>>
>>
>> -----Original Message-----
>> From: Sparsh K [mailto:sparsh...@gmail.com]
>> Sent: Thursday, January 12, 2017 7:31 AM
>> To: dev-...@ctakes.apache.org; dev@ctakes.apache.org
>> Subject: Question on ctakes
>>
>> Hi
>>
>> I am new to ctakes and have a few questions. Please guide me with your
>> inputs.
>>
>> 1. When a clinical note is input to ctakes, it processes that text
>> in multiple stages.
>> Take this clinical note as an example: SINGLE/PRETERM (35 WEEKS 5
>> DAYS)/MALE/AGA.
>>
>> Here the word "preterm" is not in the dictionary, although "preterm infant",
>> "premature baby", etc. are. So ctakes does not identify that word as
>> coveredText.
>>
>> My question is: does ctakes processing depend mainly on exact word matches
>> with the dictionary? If so, and I give it a one-page clinical note that
>> explains a disease but contains no words exactly matching the dictionary,
>> then ctakes will not identify those words. Is that true?
>>
>> 2. Ctakes does POS tagging and performs named entity recognition on the noun
>> terms. How can I pull out a list of the nouns that were not matched to a
>> named disorder code at the named entity recognition level?
>>
>>
>> Regards
>> Vighnesh
>>
>
>
