Re: Question on ctakes

Sparsh K Sun, 15 Jan 2017 22:13:27 -0800

Sorry, I used the wrong constructor for DefaultTextSpan previously, so I
ended up with an error.


Thank you so much, Sean, for the detailed reply.
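
For anyone finding this thread later, here is a rough, self-contained sketch of the filtering idea discussed below (nouns that do not overlap any identified span). It deliberately does not use cTAKES or UIMA: Span and Token are hypothetical stand-ins for DefaultTextSpan and BaseToken, and the sentence, offsets and POS tags are made up for illustration. Treat it as a sketch of the stream logic only, not as the cTAKES API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Stand-in types only; none of these are the real cTAKES or UIMA classes.
class UncoveredNouns {

   /** Hypothetical stand-in for a text span with an overlaps(..) check. */
   static final class Span {
      final int begin;
      final int end;

      Span( final int begin, final int end ) {
         this.begin = begin;
         this.end = end;
      }

      // Half-open intervals [begin, end) overlap when each starts before the other ends.
      boolean overlaps( final Span other ) {
         return begin < other.end && other.begin < end;
      }
   }

   /** Hypothetical stand-in for BaseToken: covered text, POS tag, offsets. */
   static final class Token {
      final String text;
      final String pos;
      final Span span;

      Token( final String text, final String pos, final int begin, final int end ) {
         this.text = text;
         this.pos = pos;
         this.span = new Span( begin, end );
      }
   }

   /** Returns the text of every "NN" token that overlaps no identified span. */
   static List<String> uncoveredNouns( final List<Token> tokens,
                                       final List<Span> identifiedSpans ) {
      final Predicate<Token> overlapped
            = t -> identifiedSpans.stream().anyMatch( s -> s.overlaps( t.span ) );
      return tokens.stream()
                   .filter( t -> "NN".equals( t.pos ) )   // null-safe POS check
                   .filter( overlapped.negate() )         // drop tokens already identified
                   .map( t -> t.text )
                   .collect( Collectors.toList() );
   }

   public static void main( final String[] args ) {
      // Made-up sentence: "Patient has fever and cough"
      final List<Token> tokens = Arrays.asList(
            new Token( "Patient", "NN", 0, 7 ),
            new Token( "has", "VBZ", 8, 11 ),
            new Token( "fever", "NN", 12, 17 ),
            new Token( "and", "CC", 18, 21 ),
            new Token( "cough", "NN", 22, 27 ) );
      // Pretend the dictionary lookup only identified "fever".
      final List<Span> identified = Arrays.asList( new Span( 12, 17 ) );
      uncoveredNouns( tokens, identified ).forEach( System.out::println );
   }
}
```

Note that "NN" is the Penn Treebank tag for singular common nouns only; plural and proper nouns carry NNS, NNP and NNPS, so the POS filter may need widening depending on what counts as a noun for you.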

On Mon, Jan 16, 2017 at 12:25 AM, Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Vighnesh,
>
> > I have changed JCasUtil.select ....
> a -> new DefaultTextSpan(a, 0)
> should not be changed.  The DefaultTextSpan in core.cc.pretty.textspan
> should be used as it has the overlaps(..) convenience method.  That and the
> similar class in lookup2 should be merged at some point ...
> The constructor you are using:
>    /**
>     * @param annotation     -
>     * @param sentenceOffset begin span offset of the containing sentence
>     */
>    public DefaultTextSpan( final AnnotationFS annotation,
>                            final int sentenceOffset ) {
>       this( annotation.getBegin() - sentenceOffset,
>             annotation.getEnd() - sentenceOffset );
>    }
>
> > I could not make out what needs to be added in place of BaseToken ...
> TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
>  should be
> TextSpan ts = new DefaultTextSpan( bt, 0 );
>
> My apologies if you spent a lot of time debugging - I try to get the
> details into these emails but don't really have time to write and run
> everything myself.  It might help people on the devlist (or other forums)
> to state a little about your development experience / background if you
> need help.
>
> I hope that this fixes everything,
> Sean
>
>
>
>
>
> -----Original Message-----
> From: Sparsh K [mailto:sparsh...@gmail.com]
> Sent: Sunday, January 15, 2017 12:26 PM
> To: dev@ctakes.apache.org
> Cc: dev-...@ctakes.apache.org
> Subject: Re: Question on ctakes
>
> Hi Sean,
>
> I tried to use your solution,
>
> I got a few compilation errors and fixed a few of them.
>
> I have changed JCasUtil.select( jcas, IdentifiedAnnotation.class
> ).stream().map( a -> new DefaultTextSpan(a, 0) )
>
> to JCasUtil.select( jcas, IdentifiedAnnotation.class ).stream().map( a ->
> new DefaultTextSpan(*a.getBegin()*, 0) ). I hope this is correct.
>
>
>  I could not make out what needs to be added in place of BaseToken in
> the case below.
>
>  TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
>
>
> Thanks & Regards
> Vighnesh
>
>
>
>
> On Thu, Jan 12, 2017 at 10:12 PM, Sparsh K <sparsh...@gmail.com> wrote:
>
> > Thanks for the clarification, Sean.
> >
> > On Thu, Jan 12, 2017 at 8:43 PM, Finan, Sean <
> > sean.fi...@childrens.harvard.edu> wrote:
> >
> >> Hi Vighnesh,
> >>
> >> 1.  Does ctakes depend upon exact word match?
> >>         By default, yes.  The fast clinical pipeline uses
> >> "DefaultJCasTermAnnotator" or some such horribly named class.  There
> >> is also an "OverlapJCasTermAnnotator".  Equally horrible name,
> >> slightly different functionality.  Given: "Blood, urine test" the
> >> Default will identify "blood", "urine" and "urine test".  The overlap
> >> will identify "Blood", "urine", "urine test" and "blood test".
> >> Obviously this requires all four terms to be in the dictionary.
> >>
> >> 2.  How to get all nouns in a document not covered by an
> >> IdentifiedAnnotation?
> >>
> >> JCasUtil.select( jcas, BaseToken.class ).stream().filter( b ->
> >> "NN".equals( b.getPartOfSpeech() ) ).map( Annotation::getCoveredText
> >> ).forEach( System.out::println );
> >>
> >> Something like that should work.  Filtering by discovered
> >> IdentifiedAnnotations is another step.  Something like:
> >>
> >> Collection<TextSpan> identifiedSpans = JCasUtil.select( jcas,
> >> IdentifiedAnnotation.class ).stream().map( a -> new
> >> DefaultTextSpan(a, 0) ).collect( Collectors.toList() );
> >>
> >> Predicate<BaseToken> overlapped = bt -> {
> >>    TextSpan ts = new DefaultTextSpan( BaseToken, 0 );
> >>    return identifiedSpans.stream().anyMatch( s -> s.overlaps( ts ) );
> >> };
> >>
> >> Then add .filter( overlapped.negate() ) before the original .map(
> >> Annotation::getCoveredText ).  I am not debugging this email, so you
> >> may need to check my stream methods.
> >>
> >> Sean
> >>
> >>
> >> -----Original Message-----
> >> From: Sparsh K [mailto:sparsh...@gmail.com]
> >> Sent: Thursday, January 12, 2017 7:31 AM
> >> To: dev-...@ctakes.apache.org; dev@ctakes.apache.org
> >> Subject: Question on ctakes
> >>
> >> Hi
> >>
> >> I am new to ctakes and have a few questions. Please guide me with
> >> your input.
> >>
> >> 1. When a clinical note is input to ctakes, it processes the text in
> >> multiple stages.
> >> Take this clinical note as an example: SINGLE/PRETERM (35 WEEKS 5
> >> DAYS)/MALE/AGA.
> >>
> >> Here the word "preterm" is not in the dictionary, although "preterm
> >> infant", "premature baby", etc. are. So ctakes does not identify that
> >> word as coveredText.
> >>
> >> My question is: does ctakes processing depend mainly on exact word
> >> matches with the dictionary? If so, and I give it a one-page clinical
> >> note describing a disease that contains no words exactly matching the
> >> dictionary, then ctakes will not identify those words. Is that true?
> >>
> >> 2. Ctakes does POS tagging and performs named entity recognition on the
> >> noun terms. How can I pull out a list of the nouns that are not
> >> matched to a named disorder code at the named entity recognition level?
> >>
> >>
> >> Regards
> >> Vighnesh
> >>
> >
> >
>
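
On question 1 above, the difference Sean describes for "Blood, urine test" can be pictured with a toy matcher. This is only an illustration under my own assumptions: it is not how DefaultJCasTermAnnotator or OverlapJCasTermAnnotator are implemented, ToyTermMatch and its maxSkips parameter are hypothetical, and the text is lowercased up front to keep the comparison simple. With no skipped tokens allowed, only contiguous terms match; allowing a couple of skipped tokens also picks up "blood test".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only; this is not the cTAKES dictionary lookup code.
class ToyTermMatch {

   /**
    * True if the term's tokens appear in order beginning exactly at start,
    * with at most maxSkips text tokens skipped between matched tokens.
    */
   static boolean matchesAt( final List<String> text, final int start,
                             final List<String> term, final int maxSkips ) {
      int skips = 0;
      int termIndex = 0;
      for ( int i = start; i < text.size() && termIndex < term.size(); i++ ) {
         if ( text.get( i ).equals( term.get( termIndex ) ) ) {
            termIndex++;
         } else if ( termIndex == 0 || ++skips > maxSkips ) {
            // The first token must match at start; later mismatches use up skips.
            return false;
         }
      }
      return termIndex == term.size();
   }

   /** Returns each dictionary term found somewhere in the text, in dictionary order. */
   static List<String> findTerms( final List<String> text,
                                  final List<List<String>> dictionary,
                                  final int maxSkips ) {
      final List<String> found = new ArrayList<>();
      for ( final List<String> term : dictionary ) {
         for ( int start = 0; start < text.size(); start++ ) {
            if ( matchesAt( text, start, term, maxSkips ) ) {
               found.add( String.join( " ", term ) );
               break;
            }
         }
      }
      return found;
   }

   public static void main( final String[] args ) {
      final List<String> text = Arrays.asList( "blood", ",", "urine", "test" );
      final List<List<String>> dictionary = Arrays.asList(
            Arrays.asList( "blood" ),
            Arrays.asList( "urine" ),
            Arrays.asList( "urine", "test" ),
            Arrays.asList( "blood", "test" ) );
      // Contiguous matching only: prints [blood, urine, urine test]
      System.out.println( findTerms( text, dictionary, 0 ) );
      // Up to two skipped tokens: prints [blood, urine, urine test, blood test]
      System.out.println( findTerms( text, dictionary, 2 ) );
   }
}
```

As Sean notes, either way a term is only found if it is in the dictionary, which is why "preterm" on its own is missed when only "preterm infant" is listed.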
