> yes, the line between "lookup" and rule execution is a little blurry sometimes.
Sure is. I blur it with a set of annotators that extend dictionary annotations based on words or annotations covered by the same Chunk, e.g.

DiseaseDisorderMention + /screen(ing)?/i = ProcedureMention
MedicationMention + /dependenc[ey]|addiction/i = DiseaseDisorderMention
DiseaseDisorderMention + AnatomicalSiteMention in same Chunk = DiseaseDisorderMention
ProcedureMention + AnatomicalSiteMention in same Chunk = ProcedureMention

Higher recall than the regular UmlsLookupAnnotator; higher precision than the UmlsOverlapLookupAnnotator (which skips a specified number of tokens regardless of syntax).

I've been wanting a more general framework to fit this into, and thinking it might be Ruta. Thanks for the pointer to TokensRegex; I'll look at that as well.

On Tue, May 18, 2021 at 5:39 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:

> Hi All, yes, the line between "lookup" and rule execution is a little blurry sometimes. Here's some more blurriness.
>
> I've done something related, adapting a UIMA tokens regex engine for Ctakes. You create a new type in the TypeSystem. In my case it uses CONLLDEP Annotations as the tokens to reason over. You can set up expressions (rules) that look like this.
> (Yes, this case is already covered in the dictionary, but it's an example)
>
> Matcher A: (lemma=="be");
> Matcher B: /partially|partly/;
> Matcher C: /vaccinated/;
>
> Rule vaccinated|CUI1234|SNOMED5678: A? B? C;
>
> You get the Annotation you've delegated to this task, with the entity value "vaccinated|1234|5678" and the range which spanned the tokens that caused the annotation rule to fire
>
> (See Stanford's Tokens Regex)
>
> Peter
>
>
> On Tue, May 18, 2021 at 1:29 PM Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:
>
> > But Sean, isn't what he's asking for essentially already implemented in cTAKES as the custom dictionary? I'm currently using that approach for my covid container:
> >
> > https://github.com/Machine-Learning-for-Medical-Language/ctakes-covid-container
> > Tim
> >
> > ________________________________________
> > From: Finan, Sean <sean.fi...@childrens.harvard.edu>
> > Sent: Tuesday, May 18, 2021 11:55 AM
> > To: dev@ctakes.apache.org
> > Cc: Himanshu Shekhar Sahoo
> > Subject: Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]
> >
> > * External Email - Caution *
> >
> >
> > Hi Greg,
> >
> > From 30,000 ft, I think that you would want to use the RutaEngine.
> >
> > https://urldefense.com/v3/__https://uima.apache.org/d/ruta-current/tools.ruta.book.html*ugr.tools.ruta.ae.basic__;Iw!!NZvER7FxgEiBAiR_!6YH1mXOYKMXiRAvLt8yPYWLMMklVu7YuK7KW1hde-iOew4ufAIPpkFHnsJxSvv8r5GjWickztninUTU$
> > https://urldefense.com/v3/__https://javadoc.io/doc/org.apache.uima/ruta-core/latest/org/apache/uima/ruta/engine/RutaEngine.html__;!!NZvER7FxgEiBAiR_!6YH1mXOYKMXiRAvLt8yPYWLMMklVu7YuK7KW1hde-iOew4ufAIPpkFHnsJxSvv8r5GjWickzI7QF5CI$
> > https://urldefense.com/v3/__http://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/main/java/org/apache/uima/ruta/engine/RutaEngine.java__;!!NZvER7FxgEiBAiR_!6YH1mXOYKMXiRAvLt8yPYWLMMklVu7YuK7KW1hde-iOew4ufAIPpkFHnsJxSvv8r5GjWickzJJ96zT4$
> >
> > That seems to be the actual analysis engine that loads and uses rules to create annotations.
> > While you could use an xml descriptor or use the piper "set" command and do things like mapping ruta to ctakes type systems, I would take the alternate approach of "copying" the initialize(..) and process (..) methods and modify them to use ctakes types directly.
> >
> > Disclaimer: I know very little about uima ruta. At some point I did look into it but it was for a specific (ctakes-derivative) project and I didn't go further than basic doc perusal.
> >
> > If you move forward with this please let us all know what you find.
> > I think that there will be great interest in the community.
> >
> > Sean
> > ________________________________________
> > From: Greg Silverman <g...@umn.edu.INVALID>
> > Sent: Tuesday, May 18, 2021 11:13 AM
> > To: dev@ctakes.apache.org
> > Cc: Himanshu Shekhar Sahoo
> > Subject: Re: rule-based lookup for custom lexicon [EXTERNAL]
> >
> > Hi Sean,
> > I was wondering if there was a way to use rule-based lookup of a custom lexicon within cTAKES (say a locally curated list of COVID-19 symptoms). When I Googled around, I stumbled on UIMA Ruta, but couldn't find anything with regard to cTAKES specifics.
> >
> > Thanks!
> >
> > Greg--
> >
> > On Tue, May 18, 2021 at 10:04 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
> >
> > > To which ctakes component(s) are you referring?
> > > ________________________________________
> > > From: Greg Silverman <g...@umn.edu.INVALID>
> > > Sent: Sunday, May 16, 2021 6:02 PM
> > > To: dev@ctakes.apache.org; Himanshu Shekhar Sahoo
> > > Subject: rule-based lookup for custom lexicon [EXTERNAL]
> > >
> > > I looked all over and could not find any information on how to add this pipeline component to cTAKES. I assume it uses UIMA Ruta?
> > >
> > > Thanks in advance!
> > >
> > > Greg--
> > > --
> > > Greg M. Silverman
> > > Senior Systems Developer
> > > NLP/IE <https://urldefense.com/v3/__https://healthinformatics.umn.edu/research/nlpie-group__;!!NZvER7FxgEiBAiR_!6hN356eDesvWNYzsrDMaXgF6IkZw313QU2QUQw5M8Jysvh1K1JxjEBeztZicX1DM2jC0o7_0qAA$>
> > > Department of Surgery
> > > University of Minnesota
> > > g...@umn.edu
> > >
> >
> > --
> > Greg M. Silverman
> > Senior Systems Developer
> > NLP/IE <https://urldefense.com/v3/__https://healthinformatics.umn.edu/research/nlpie-group__;!!NZvER7FxgEiBAiR_!8uKf_4SXyKdCmvlMHvRGddxlzofg64D4_zsPdCThqeMAyn2akyMNI8wqM6yNUZA2N93F-aAsR7I$>
> > Department of Surgery
> > University of Minnesota
> > g...@umn.edu
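
For anyone skimming the thread, the chunk-based extension rules in the reply at the top can be sketched roughly as below. This is only an illustration in plain Python, not cTAKES or UIMA code: `Span`, `WORD_RULES`, `TYPE_RULES` and `extend_mentions` are hypothetical names, and the choice to make the new annotation cover the original mention plus the triggering word or annotation is an assumption, since the reply does not say what span the extended annotation gets.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for an annotation: character offsets plus a type label."""
    begin: int
    end: int
    label: str


# (dictionary mention type, trigger word pattern, resulting type)
WORD_RULES = [
    ("DiseaseDisorderMention", re.compile(r"screen(ing)?", re.I), "ProcedureMention"),
    ("MedicationMention", re.compile(r"dependenc[ey]|addiction", re.I), "DiseaseDisorderMention"),
]

# (dictionary mention type, co-occurring annotation type, resulting type)
TYPE_RULES = [
    ("DiseaseDisorderMention", "AnatomicalSiteMention", "DiseaseDisorderMention"),
    ("ProcedureMention", "AnatomicalSiteMention", "ProcedureMention"),
]


def covered(chunk, spans):
    """Return the spans that lie entirely inside the chunk."""
    return [s for s in spans if chunk.begin <= s.begin and s.end <= chunk.end]


def extend_mentions(text, chunks, mentions):
    """Apply the word and type rules within each chunk; return only the new spans."""
    new = []
    for chunk in chunks:
        inside = covered(chunk, mentions)
        chunk_text = text[chunk.begin:chunk.end]
        for m in inside:
            for src, pattern, result in WORD_RULES:
                if m.label != src:
                    continue
                for hit in pattern.finditer(chunk_text):
                    hb, he = chunk.begin + hit.start(), chunk.begin + hit.end()
                    # Only fire when the trigger word lies outside the mention itself.
                    if he <= m.begin or hb >= m.end:
                        new.append(Span(min(m.begin, hb), max(m.end, he), result))
            for src, other, result in TYPE_RULES:
                if m.label != src:
                    continue
                for o in inside:
                    # Only fire on a non-overlapping annotation of the other type.
                    if o.label == other and (o.end <= m.begin or o.begin >= m.end):
                        new.append(Span(min(m.begin, o.begin), max(m.end, o.end), result))
    return new


text = "breast cancer screening"
chunks = [Span(0, 23, "Chunk")]
mentions = [Span(0, 13, "DiseaseDisorderMention")]
print(extend_mentions(text, chunks, mentions))
```

Because a rule only fires when both pieces fall inside one Chunk, this stays tighter than skipping a fixed number of tokens regardless of syntax, which is the precision point made in the reply.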