Re: Allergy Annotator

Ks Sunder Tue, 10 Jan 2017 22:29:42 -0800

Hi All,

My scenario is this: read string content from a CSV file, then find the
medical terms in that content using cTAKES and UMLS.


Following your suggestion, I looked for a CollectionReader in ctakes-core, but
I could not find a clear solution. Could you please suggest one, with an example?


regards,
shyam k.

On Thu, Dec 22, 2016 at 9:16 PM, Finan, Sean <
[email protected]> wrote:

> Hi Shyam,
>
> I think that the key to your first question
> >   how can execute the single function to run all this jobs in short
> >   time...
> is in your code here:
>
> 1       final JCas jcas = JCasFactory.createJCas();
> 2       jcas.setDocumentText( nextLine[0] );
> 3       SimplePipeline.runPipeline(jcas, getUMLPipeline());
>
> What you probably want to do is replace lines #1 and #2 with a
> CollectionReader, and then in #3 use a different SimplePipeline call that
> runs the pipeline using the CollectionReader instead of a static cas.
>
> There are commonly used CollectionReaders in ctakes-core.  The most widely
> applicable is probably the FileTreeReader*, which reads a tree of ASCII
> files.  If you have some other source of text data then look around the
> code for something that might fit and let the devlist know if you can't
> find anything that fits your needs.
>
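A minimal sketch of the reader idea above, written in plain Java so that it runs without cTAKES on the classpath. It only models the reader contract (is there another document, give me the next one) over CSV rows. The class name CsvTextSource and its methods are hypothetical, and it assumes the text sits in the first column, as nextLine[0] does in the code quoted further down. In a real pipeline this logic would presumably live in a uimaFIT collection reader whose getNext sets the document text on the JCas; that wiring is an assumption and is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java model of a reader: one CSV row in, one document text out. */
final class CsvTextSource {
    private final BufferedReader reader;
    private String nextText; // text of the next non-empty row, or null at end of input

    CsvTextSource(BufferedReader reader) {
        this.reader = reader;
        advance();
    }

    /** Plays the role of a reader's hasNext(): true while another row is available. */
    boolean hasNext() {
        return nextText != null;
    }

    /** Plays the role of getNext: returns the next document text, or null when exhausted. */
    String next() {
        String current = nextText;
        advance();
        return current;
    }

    // Reads ahead to the next row whose first column is not blank.
    private void advance() {
        nextText = null;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String first = firstColumn(line).trim();
                if (!first.isEmpty()) {
                    nextText = first;
                    return;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** First CSV field of a single line; handles a quoted field with "" escapes, not multi-line fields. */
    static String firstColumn(String line) {
        if (!line.startsWith("\"")) {
            int comma = line.indexOf(',');
            return comma < 0 ? line : line.substring(0, comma);
        }
        StringBuilder field = new StringBuilder();
        for (int i = 1; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '"') {
                field.append(c);
            } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                field.append('"'); // doubled quote is an escaped quote
                i++;
            } else {
                break; // closing quote ends the field
            }
        }
        return field.toString();
    }

    /** Convenience for trying the sketch on an in-memory CSV string. */
    static List<String> readAll(String csv) {
        CsvTextSource source = new CsvTextSource(new BufferedReader(new StringReader(csv)));
        List<String> texts = new ArrayList<>();
        while (source.hasNext()) {
            texts.add(source.next());
        }
        return texts;
    }
}
```

The point of the design is that the analysis engines are built once and fed one row at a time, instead of a new pipeline being created for every line of the file.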
> I don't understand your second question:
> > how can i find sentence vised Dictionary words from string, give me a
> > solution for this..
> Can you rephrase it and post to the devlist again?
>
> * one advantage that the FileTreeReader has is that it stores metadata on
> the input file tree placement, which can then be reproduced by output file
> writers like the html writer.
>
> Sean
>
>
> -----Original Message-----
> From: Ks Sunder [mailto:[email protected]]
> Sent: Thursday, December 22, 2016 2:33 AM
> To: [email protected]
> Subject: Re: Allergy Annotator
>
> Hi All,
>
> I have written the code below to find medical terms in string input.
>
> Step 1:
>
> public static AnalysisEngineDescription getUMLPipeline()
>       throws ResourceInitializationException, URISyntaxException {
>    AggregateBuilder builder = new AggregateBuilder();
>    builder.add( SimpleSegmentAnnotator.createAnnotatorDescription() );
>    builder.add( SentenceDetector.createAnnotatorDescription() );
>    builder.add( TokenizerAnnotatorPTB.createAnnotatorDescription() );
>    builder.add( POSTagger.createAnnotatorDescription() );
>    builder.add( ClinicalPipelineFactory.getNpChunkerPipeline() );
>    builder.add( LvgAnnotator.createAnnotatorDescription() );
>
>    try {
>       builder.add( AnalysisEngineFactory.createEngineDescription(
>             DefaultJCasTermAnnotator.class,
>             AbstractJCasTermAnnotator.PARAM_WINDOW_ANNOT_PRP,
>             "org.apache.ctakes.typesystem.type.textspan.Sentence",
>             JCasTermAnnotator.DICTIONARY_DESCRIPTOR_KEY,
>             ExternalResourceFactory.createExternalResourceDescription(
>                   FileResourceImpl.class,
>                   FileLocator.locateFile(
>                         "org/apache/ctakes/dictionary/lookup/fast/cTakesHsql.xml" ) )
>       ) );
>    } catch ( FileNotFoundException e ) {
>       e.printStackTrace();
>       throw new ResourceInitializationException( e );
>    }
>
>    return builder.createAggregateDescription();
> }
>
> Step 2:
>
> final JCas jcas = JCasFactory.createJCas();
> jcas.setDocumentText( nextLine[0] );
> SimplePipeline.runPipeline( jcas, getUMLPipeline() );
>
> for ( IdentifiedAnnotation entity : JCasUtil.select( jcas, IdentifiedAnnotation.class ) ) {
>    if ( entity.getOntologyConceptArr() != null ) {
>       add.append( entity.getCoveredText() + "," );
>    }
> }
>
> It's working fine.
>
> But I have two queries.
>
> 1. step1 , i am using Annotator step by step ... that time its taking more
> time load the all fuctions
>    how can execute the single function to run all this jobs in short
> time...
>
> 2. how can i find sentence vised Dictionary words from string, give me a
> solution for this..
>
> Please give me solutions for these issues.
>
> regards,
> shyam k.
>
> On Thu, Dec 8, 2016 at 1:59 AM, Mullane, Sean *HS <
> [email protected]> wrote:
>
> > I'm reviving this thread with reference to negation detection. I
> > previously posted about this to the User list but this is probably a
> > more appropriate venue.
> >
> > The way the sentences are split on ":" makes the negation annotator
> > miss negation in lists of this form:
> >
> > Hyperlipidemia:  Yes
> > Hypercholesterolemia:  No
> > Chronic Renal Insufficiency:  N/A
> >
> > I tried reversing order and removing ":"s and found that the negation
> > for Hypercholesterolemia is detected when in this form:
> >
> > Yes Hyperlipidemia
> > No Hypercholesterolemia
> > N/A Chronic Renal Insufficiency
> >
> > Our notes have quite a few places with this sort of list where good
> > negation detection is important, but I haven't had very good results. The
> > sentence segmentator sees this as 12 separate sentences, but I would
> > think proper behavior would be to consider this as 6 sentences
> > (breaking sentences on line breaks but not on colons). I see previous
> > discussion on the list about the sentence segmentator breaking on
> > newlines but little regarding colons. I would think in most cases it
> > would be more useful not to break on ":". Or is there an overriding
> > reason for the current behavior?
> > If changing the sentence segmentator isn't an option, is there a
> > different way to configure the negation detection annotator that would
> > avoid this issue?
> >
> > Thanks,
> > Sean
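
One possible workaround for the colon splitting described above, sketched in plain Java: after sentence detection, re-join a sentence that ends in ":" with the sentence that follows it on the same line. The names ColonSentenceMerger, merge and coveredText are hypothetical, and sentences are modelled as [begin, end) offset pairs rather than cTAKES Sentence annotations. Whether the negation annotator would then handle "Hypercholesterolemia:  No" correctly is not something this sketch can confirm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical post-processing step: merge "KEY:" with the "VALUE" that follows on the same line. */
final class ColonSentenceMerger {

    /**
     * Expects sentence spans as {begin, end} offsets into text, sorted and non-overlapping.
     * A span ending in ':' is merged with the next span unless a newline separates them.
     */
    static List<int[]> merge(String text, List<int[]> sentences) {
        List<int[]> merged = new ArrayList<>();
        int i = 0;
        while (i < sentences.size()) {
            int[] current = sentences.get(i);
            if (i + 1 < sentences.size()) {
                int[] next = sentences.get(i + 1);
                boolean endsWithColon =
                        current[1] > current[0] && text.charAt(current[1] - 1) == ':';
                boolean sameLine =
                        text.substring(current[1], next[0]).indexOf('\n') < 0;
                if (endsWithColon && sameLine) {
                    merged.add(new int[] {current[0], next[1]});
                    i += 2; // both spans are consumed by the merged sentence
                    continue;
                }
            }
            merged.add(current);
            i++;
        }
        return merged;
    }

    /** Returns the text covered by each span, for inspection. */
    static List<String> coveredText(String text, List<int[]> spans) {
        List<String> out = new ArrayList<>();
        for (int[] span : spans) {
            out.add(text.substring(span[0], span[1]));
        }
        return out;
    }
}
```

Applied to the first two list lines above, the four detected fragments collapse into two sentences, one per line, while a header that sits alone on its line is left untouched.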
> >
> >
> >
> > Hi,
> >
> > I am interested in the design decision of the sentence detector.
> >
> > Why does it split a sentence of the form "WORD1: WORD2 WORD3." into
> > two sentences "WORD1:" and "WORD2 WORD3."? Do other components of
> > cTAKES require such a sentence splitting?
> >
> > It would seem to me that it should remain one sentence. For example,
> > the smoking status detector has its own SentenceAdjuster that merges
> > some of such sentences back into one, because of this design.
> >
> > Thanks, Tomasz
> >
> > ________________________________________
> > From: Finan, Sean [[email protected]]
> > Sent: Friday, July 10, 2015 3:20 PM
> > To: [email protected]
> > Subject: RE: Allergy Annotator
> >
> > Hi Tom,
> >
> > It is exactly because the sentence detector splits "KEY:" from "VALUE"
> > that I didn't suggest using sentences. Instead, I would just iterate over
> > the whole cas collection of medication events and attempt to match allergy
> > phrases ("allergic to medication") against the text of the note spanning
> > from event.begin-15 to event.end+15, or whatever window size you prefer.
> >
> > Sean
> >
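
The windowing idea above can be sketched in plain Java, with no cTAKES types involved. The class AllergyWindow, its methods and the cue pattern are hypothetical; in a real annotator the begin and end offsets would presumably come from each medication mention, and the phrase patterns would need to be far richer than this single regular expression.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: look for an allergy cue in a character window around a mention. */
final class AllergyWindow {
    // Assumed cue words; a real rule set would be broader.
    private static final Pattern CUE =
            Pattern.compile("\\ballerg(?:y|ies|ic)\\b", Pattern.CASE_INSENSITIVE);

    /** Text from begin - margin to end + margin, clamped to the bounds of the note. */
    static String window(String text, int begin, int end, int margin) {
        int from = Math.max(0, begin - margin);
        int to = Math.min(text.length(), end + margin);
        return text.substring(from, to);
    }

    /** True if an allergy cue occurs inside the window around [begin, end). */
    static boolean hasAllergyCue(String text, int begin, int end, int margin) {
        return CUE.matcher(window(text, begin, end, margin)).find();
    }
}
```

The window size matters. In "ALLERGIES: PENICILLIN, WHEAT" a margin of 15 characters reaches the cue from PENICILLIN but not from WHEAT, which sits further along the list; a margin of 25 reaches it from both.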
> > -----Original Message-----
> > From: Tom Devel [mailto:[email protected]]
> > Sent: Friday, July 10, 2015 4:12 PM
> > To: [email protected]
> > Subject: Re: Allergy Annotator
> >
> > Sean and Dima, these are great suggestions, thanks so far.
> >
> > Sean, when looping over medication events as you say, I can see how it
> > is possible to take the textspan.Sentence of this MedicationMention,
> > and then do a regex check for the phrase structure as Dima said.
> >
> > But instead of textspan.Sentence, you mention "see any is included in
> > a phrase".
> > What cTAKES/UIMA class is related to this?
> >
> > Because if I would use textspan.Sentence, it would work for "The
> > patient is allergic to penicillin.", but cTAKES splits "ALLERGIES:
> > PENICILLIN, WHEAT" into two sentences, so that the MedicationMentions
> > here would not be in the same sentence as the word "ALLERGIES".
> >
> > Thanks again, Tom
> >
> > On Fri, Jul 10, 2015 at 2:12 PM, Finan, Sean <[email protected]> wrote:
> >
> > Hi Dima, Tom,
> >
> > I was thinking the same as Dima's first solution. Iterate through the
> > medication events and see any is included in a phrase as mentioned in
> > Tom's original email. Each phrase structure would have to be specified
> > beforehand. However, assigning appropriate CUIs would require having a
> > lookup table for each medication allergy. I think that would be the
> > simplest solution.
> >
> > Sean
> >
> > -----Original Message-----
> > From: Dligach, Dmitriy [mailto:[email protected]]
> > Sent: Friday, July 10, 2015 2:50 PM
> > To: cTAKES Developer list
> > Subject: Re: Allergy Annotator
> >
> > Hi Tom,
> >
> > If the patterns are pretty simple, you could just add a few rules on
> > top of the cTAKES dictionary lookup output. Something of the kind
> > "allergic to <medication>" or "allergies: <medication1>,
> > <medication2>, <substance1>, ...".
> >
> > If these patterns are hard to express as rules, you should consider a
> > machine learning based sequence labeling route (e.g. something similar
> > to the cTAKES chunker).
> >
> > Dima
> >
> > -- Dmitriy (Dima) Dligach, Ph.D. Boston Children's Hospital and
> > Harvard Medical School (617) 651-0397
> >
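
The two rule shapes suggested above ("allergic to <medication>" and "allergies: <medication1>, <medication2>, ...") could be sketched as regular expressions in plain Java. This is only an illustration under stated assumptions: the class AllergyRules and both patterns are hypothetical, the rules run on raw text rather than on top of the dictionary lookup output, and no CUIs are assigned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical rule set for the two phrase shapes discussed in this thread. */
final class AllergyRules {
    // "allergic to <substance>": captures the single word after the cue.
    private static final Pattern ALLERGIC_TO = Pattern.compile(
            "\\ballergic\\s+to\\s+([A-Za-z][A-Za-z-]*)", Pattern.CASE_INSENSITIVE);

    // "allergies: a, b, c": captures the rest of the line after the header.
    private static final Pattern ALLERGY_LIST = Pattern.compile(
            "^\\s*allergies\\s*:\\s*(.+)$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    /** Returns lower-cased allergen strings: "allergic to" matches first, then list items. */
    static List<String> findAllergens(String text) {
        List<String> found = new ArrayList<>();
        Matcher phrase = ALLERGIC_TO.matcher(text);
        while (phrase.find()) {
            found.add(phrase.group(1).toLowerCase(Locale.ROOT));
        }
        Matcher list = ALLERGY_LIST.matcher(text);
        while (list.find()) {
            for (String item : list.group(1).split(",")) {
                String trimmed = item.trim();
                if (!trimmed.isEmpty()) {
                    found.add(trimmed.toLowerCase(Locale.ROOT));
                }
            }
        }
        return found;
    }
}
```

Both example inputs from the original question are covered by these two patterns, but anything outside them (multi-word substances after "allergic to", lists without commas) is not, which is where the sequence labeling route mentioned above would come in.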
> > On Jul 10, 2015, at 13:40, Tom Devel <[email protected]> wrote:
> >
> > Sean,
> >
> > It would be a wider net, such that if an allergy is mentioned in the
> > clinical note, this is captured in the corresponding
> > IdentifiedAnnotation (or alternatively, if the IdentifiedAnnotation
> > class should not be changed with a new attribute, in a separate
> > allergy annotation).
> >
> > This annotator would then have to of course run after the clinical
> > pipeline has run and discovered all IdentifiedAnnotations.
> >
> > I am familiar with writing UIMA/cTAKES annotators, but not sure how a
> > new ML method could be integrated here for detecting allergies. Do you
> > have any thoughts about how to approach this in general?
> >
> > Thanks, Tom
> >
> > On Fri, Jul 10, 2015 at 11:54 AM, Finan, Sean <[email protected]> wrote:
> >
> > Hi Tom,
> >
> > Are you interested in catching all allergies or just a few specific
> > allergies for a study? If you are only concerned with a few then there
> > is a (possibly) simple solution. If you are interested in throwing a wider
> > net then I think that a new module would need to be created; does
> > anybody reading this have an ML or regex style module?
> >
> > Sean
> >
> > -----Original Message-----
> > From: Tom Devel [mailto:[email protected]]
> > Sent: Friday, July 10, 2015 12:42 PM
> > To: [email protected]
> > Subject: Allergy Annotator
> >
> > Hi,
> >
> > I would like to use/extend cTAKES to detect allergies.
> >
> > In the cTAKES publication (2010)
> >
> > http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995668/
> >
> > there is the mention that: "Allergies to a given medication are
> > handled by setting the negation attribute of that medication to
> > 'is negated'."
> >
> > However, in a post here in 2014 (RE: Allergy Indication) it is said
> > that cTAKES does not have a module for allergy discovery.
> >
> > 1. What is the current status of allergy detection in cTAKES?
> >
> > 2. I did some testing. While cTAKES discovers concepts about allergies
> > ("wheat allergy" is found as C0949570), using "ALLERGIES: PENICILLIN,
> > WHEAT" or "The patient is allergic to penicillin." does not give the
> > penicillin or wheat annotations an allergy status.
> >
> > How would I go about detecting these allergy mentions?
> >
> > Thanks, Tom
> >
> >
>
