I am interested. Thank you 

Shyam Bhimani
Software Engineer


  


-----Original Message-----
From: Kean Kaufmann <k...@recordsone.com> 
Sent: Wednesday, May 19, 2021 2:08 PM
To: dev@ctakes.apache.org
Subject: Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]



>
> If anybody out there in the general community is interested, please 
> reply on this thread and maybe we can coordinate a single presentation time.


Yes please. Thanks, Sean and (other) Peter!

On Wed, May 19, 2021 at 3:42 PM Finan, Sean < sean.fi...@childrens.harvard.edu> 
wrote:

> Hi (other) Peter,
>
> Many thanks for jumping in on this!
>
> I would definitely be interested in seeing some examples, even though 
> I don't have any specific use case right now.
>
> I will ask a few local people and see if they are interested in an 
> informal video chat.  If anybody out there in the general community is 
> interested, please reply on this thread and maybe we can coordinate a 
> single presentation time.
>
> Cheers,
>
> Sean
> ________________________________________
> From: Peter Klügl <peter.klu...@averbis.com>
> Sent: Wednesday, May 19, 2021 3:33 PM
> To: dev@ctakes.apache.org
> Subject: Re: rule-based lookup for custom lexicon [EXTERNAL] 
> [SUSPICIOUS]
>
>
>
> Hi all,
>
>
> if you are interested in UIMA Ruta and want to know more about it, you 
> can always ask on the UIMA user list or me directly (I am the creator 
> of UIMA Ruta). I can also prepare some slides and we can have an 
> informal video chat where I give an overview of Ruta.
>
>
> I am of course not objective here (for several reasons) but I think 
> UIMA Ruta could be really useful for cTAKES. It was originally 
> developed for segmenting and processing discharge letters and similar 
> clinical documents. Since then (>10 years), Ruta has always been 
> applied to clinical documents and is being deployed in production by 
> several companies. The language has some advantages and disadvantages 
> compared to other rule languages. In the context of cTAKES, the 
> direct/comprehensive support of UIMA and the IDE dev support are maybe 
> the most relevant advantages.
>
>
> I was thinking about creating some introductory examples for the 
> combination and usage of UIMA Ruta and cTAKES. If you have a good use 
> case, let me know.
>
>
> Best,
>
>
> (another) Peter
>
>
> Am 19.05.2021 um 14:30 schrieb Finan, Sean:
> > Hi all,
> > Correct.
> >
> > Tim is correct in the sense that he is using a custom dictionary
> > (custom synonyms, CUIs, etc.) which kind of changes the "rules" of
> > what the standard dictionary lookup considers a valid term based upon
> > available tokens in the text.  There are other simple settings that
> > further qualify how the standard dictionary lookup accepts or discards
> > synonyms.
> >
> > I think that what Greg is asking about is something with introduced
> > "logic" that can alter or remove terms already discovered by the
> > standard dictionary lookup.
> >
> > Peter and Kean both outline some custom annotators that they have
> > created to use logic that can alter/add/remove terms discovered by the
> > standard dictionary lookup.  I do the same thing for different
> > projects and advise everybody who applies ctakes to specific domains to
> > do the same.
> >
> > ctakes is a general purpose tool and results can definitely be improved
> > when catered to a more narrow purpose.
> >
> > Back to Greg, I got the feeling that he might be interested in a more
> > versatile annotator.  Introducing an engine that can utilize something
> > like ruta has several advantages:
> > 1.  You can "easily" add complex rules in one place.
> > 2.  You can change rules external to code ...
> >   2a.  The same pipeline can be catered to different projects without
> >        changing code in an annotator or creating a new annotator.
> >   2b.  An end user who knows nothing about ctakes can change a ruta
> >        script to fit their purposes.
> > 3.  Rules are supported and documented by uima ruta, so you don't have to
> >     worry about that extra headache.
> > 4.  Once Greg adds it to apache ctakes (right? ;^) everybody in the
> >     community can apply ruta rules to their project.
> >
> > When I looked at it a few years ago it was for reason 2b.  In the end we
> > went for different annotators like Peter and Kean outlined and just
> > use piper file changes to satisfy #2 as that is definitely much easier.
> > However, it doesn't benefit the community as a whole (#4).
> >
> > Cheers all, this is a great conversation!
> >
> > Sean
> >
> >
> >
> >
> > ________________________________________
> > From: Kean Kaufmann <k...@recordsone.com>
> > Sent: Wednesday, May 19, 2021 7:50 AM
> > To: dev@ctakes.apache.org
> > Subject: Re: rule-based lookup for custom lexicon [EXTERNAL] 
> > [SUSPICIOUS]
> >
> >
> >
> >> yes,  the line between "lookup" and rule execution is a little blurry
> >> sometimes.
> >
> > Sure is.  I blur it with a set of annotators that extend dictionary 
> > annotations based on words or annotations covered by the same Chunk, e.g.
> >
> > DiseaseDisorderMention + /screen(ing)?/i = ProcedureMention
> > MedicationMention + /dependenc[ey]|addiction/i = DiseaseDisorderMention
> > DiseaseDisorderMention + AnatomicalSiteMention in same Chunk = DiseaseDisorderMention
> > ProcedureMention + AnatomicalSiteMention in same Chunk = ProcedureMention
> >
> > Higher recall than the regular UmlsLookupAnnotator; higher precision 
> > than the UmlsOverlapLookupAnnotator (which skips a specified number 
> > of tokens regardless of syntax).
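The four chunk-scoped rules above can be sketched outside UIMA as follows. This is only an illustration in plain Python, not the annotators described in the message: the Span class, the rule tables and extend_mentions are hypothetical names, spans are plain character offsets rather than cTAKES annotations, and the word boundaries in the patterns are an added assumption.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labeled character range; a stand-in for a UIMA annotation."""
    begin: int
    end: int
    label: str


# Rules of the form: mention label + regex elsewhere in the chunk = new label.
WORD_RULES = [
    ("DiseaseDisorderMention",
     re.compile(r"\bscreen(ing)?\b", re.I),
     "ProcedureMention"),
    ("MedicationMention",
     re.compile(r"\b(dependenc[ey]|addiction)\b", re.I),
     "DiseaseDisorderMention"),
]

# Rules of the form: mention label + other label in the same chunk = new label.
LABEL_RULES = [
    ("DiseaseDisorderMention", "AnatomicalSiteMention", "DiseaseDisorderMention"),
    ("ProcedureMention", "AnatomicalSiteMention", "ProcedureMention"),
]


def extend_mentions(text, chunks, mentions):
    """Return the new spans produced by the rules; inputs are not modified."""
    new_spans = []
    for chunk in chunks:
        # Only mentions fully covered by this chunk take part in its rules.
        inside = [m for m in mentions
                  if chunk.begin <= m.begin and m.end <= chunk.end]
        for m in inside:
            for label, pattern, result in WORD_RULES:
                if m.label != label:
                    continue
                for hit in pattern.finditer(text, chunk.begin, chunk.end):
                    # Ignore trigger words that lie inside the mention itself.
                    if hit.end() <= m.begin or hit.start() >= m.end:
                        new_spans.append(Span(min(m.begin, hit.start()),
                                              max(m.end, hit.end()),
                                              result))
            for label, other_label, result in LABEL_RULES:
                if m.label != label:
                    continue
                for other in inside:
                    if other.label == other_label:
                        new_spans.append(Span(min(m.begin, other.begin),
                                              max(m.end, other.end),
                                              result))
    return new_spans


# Usage: a disease mention followed by "screening" in the same chunk.
text = "Referred for breast cancer screening."
start = text.index("breast")
chunk = Span(start, text.index("."), "Chunk")
mention = Span(start, start + len("breast cancer"), "DiseaseDisorderMention")
print(extend_mentions(text, [chunk], [mention]))
```

Restricting every rule to a single chunk is what keeps the extension syntax-aware, in contrast to skipping a fixed number of tokens.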
> >
> > I've been wanting a more general framework to fit this into, and 
> > thinking it might be Ruta.
> > Thanks for the pointer to TokensRegex; I'll look at that as well.
> >
> >
> > On Tue, May 18, 2021 at 5:39 PM Peter Abramowitsch <pabramowit...@gmail.com>
> > wrote:
> >
> >> Hi All,  yes,  the line between "lookup" and rule execution is a little
> >> blurry sometimes.   Here's some more blurriness.
> >>
> >> I've done something related, adapting a UIMA tokens regex engine for
> >> cTAKES.  You create a new type in the TypeSystem.  In my case it uses
> >> CONLLDEP Annotations as the tokens to reason over.   You can set up
> >> expressions (rules) that look like this.
> >> (Yes, this case is already covered in the dictionary, but it's an
> >> example)
> >>
> >> Matcher A:   (lemma=="be");
> >> Matcher B:   /partially|partly/;
> >> Matcher C:   /vaccinated/;
> >>
> >> Rule  vaccinated|CUI1234|SNOMED5678:  A? B?  C;
> >>
> >> You get the Annotation you've delegated to this task, with the
> >> entity value  "vaccinated|1234|5678"  and the range which spanned
> >> the tokens that caused the annotation rule to fire.
> >>
> >> (See Stanford's Tokens Regex)
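As a rough illustration of how a rule such as "A? B? C" can be evaluated over tokens, here is a small Python sketch. It is not the UIMA tokens regex engine mentioned above: the Token class, the LEMMAS table (a stand-in for real lemmas from CONLLDEP annotations), the function names, and the choice to reuse the rule name as the entity value are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    begin: int
    end: int
    text: str
    lemma: str


# Hypothetical stand-in for a real lemmatizer or dependency parser.
LEMMAS = {"is": "be", "was": "be", "are": "be", "were": "be", "been": "be"}


def tokenize(text):
    """Split on word characters and attach a toy lemma to each token."""
    return [
        Token(m.start(), m.end(), m.group(),
              LEMMAS.get(m.group().lower(), m.group().lower()))
        for m in re.finditer(r"\w+", text)
    ]


def matcher_a(tok):
    return tok.lemma == "be"


def matcher_b(tok):
    return re.fullmatch(r"partially|partly", tok.text, re.I) is not None


def matcher_c(tok):
    return re.fullmatch(r"vaccinated", tok.text, re.I) is not None


# Rule "A? B? C": each element is (matcher, is_optional).
RULE_VALUE = "vaccinated|CUI1234|SNOMED5678"
RULE = [(matcher_a, True), (matcher_b, True), (matcher_c, False)]


def match_end(tokens, ti, rule, ri=0):
    """Return the index one past the last matched token, or None on failure."""
    if ri == len(rule):
        return ti
    matcher, optional = rule[ri]
    if ti < len(tokens) and matcher(tokens[ti]):
        end = match_end(tokens, ti + 1, rule, ri + 1)
        if end is not None:
            return end
    if optional:
        # Backtrack: skip this optional element and try the next one.
        return match_end(tokens, ti, rule, ri + 1)
    return None


def find_matches(tokens, rule, value):
    """Scan left to right and return (begin, end, value) for each rule firing."""
    results = []
    i = 0
    while i < len(tokens):
        end = match_end(tokens, i, rule)
        if end is not None and end > i:
            results.append((tokens[i].begin, tokens[end - 1].end, value))
            i = end
        else:
            i += 1
    return results


# Usage: the span covers every token that caused the rule to fire.
print(find_matches(tokenize("She is partly vaccinated."), RULE, RULE_VALUE))
```

Each optional element is tried first and skipped only if the rest of the rule then fails, so the reported range is the longest one starting at a given token.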
> >>
> >> Peter
> >>
> >>
> >> On Tue, May 18, 2021 at 1:29 PM Miller, Timothy < 
> >> timothy.mil...@childrens.harvard.edu> wrote:
> >>
> >>> But Sean, isn't what he's asking for essentially already
> >>> implemented in cTAKES as the custom dictionary? I'm currently
> >>> using that approach for my covid container:
> >>>
> >>>
> >>> https://github.com/Machine-Learning-for-Medical-Language/ctakes-covid-container
> >>> Tim
> >>>
> >>> ________________________________________
> >>> From: Finan, Sean <sean.fi...@childrens.harvard.edu>
> >>> Sent: Tuesday, May 18, 2021 11:55 AM
> >>> To: dev@ctakes.apache.org
> >>> Cc: Himanshu Shekhar Sahoo
> >>> Subject: Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]
> >>>
> >>>
> >>>
> >>> Hi Greg,
> >>>
> >>> From 30,000 ft, I think that you would want to use the RutaEngine.
> >>>
> >>>
> >>>
> >>> https://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.ae.basic
> >>>
> >>> https://javadoc.io/doc/org.apache.uima/ruta-core/latest/org/apache/uima/ruta/engine/RutaEngine.html
> >>>
> >>> http://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/main/java/org/apache/uima/ruta/engine/RutaEngine.java
> >>>
> >>> That seems to be the actual analysis engine that loads and uses rules to
> >>> create annotations.
> >>> While you could use an xml descriptor or use the piper "set" command and
> >>> do things like mapping ruta to ctakes type systems, I would take the
> >>> alternate approach of "copying" the initialize(..) and process(..) methods
> >>> and modifying them to use ctakes types directly.
> >>>
> >>> Disclaimer:  I know very little about uima ruta.  At some point I did look
> >>> into it but it was for a specific (ctakes-derivative) project and I didn't
> >>> go further than basic doc perusal.
> >>>
> >>> If you move forward with this please let us all know what you 
> >>> find.  I think that there will be great interest in the community.
> >>>
> >>> Sean
> >>> ________________________________________
> >>> From: Greg Silverman <g...@umn.edu.INVALID>
> >>> Sent: Tuesday, May 18, 2021 11:13 AM
> >>> To: dev@ctakes.apache.org
> >>> Cc: Himanshu Shekhar Sahoo
> >>> Subject: Re: rule-based lookup for custom lexicon [EXTERNAL]
> >>>
> >>>
> >>>
> >>> Hi Sean,
> >>> I was wondering if there was a way to use rule-based lookup of a
> >>> custom lexicon within cTAKES (say a locally curated list of COVID-19
> >>> symptoms).
> >>> When I Googled around, I stumbled on UIMA Ruta, but couldn't find anything
> >>> with respect to cTAKES specifics.
> >>>
> >>> Thanks!
> >>>
> >>>
> >>> Greg--
> >>>
> >>> On Tue, May 18, 2021 at 10:04 AM Finan, Sean < 
> >>> sean.fi...@childrens.harvard.edu> wrote:
> >>>
> >>>>  To which ctakes component(s) are you referring?
> >>>> ________________________________________
> >>>> From: Greg Silverman <g...@umn.edu.INVALID>
> >>>> Sent: Sunday, May 16, 2021 6:02 PM
> >>>> To: dev@ctakes.apache.org; Himanshu Shekhar Sahoo
> >>>> Subject: rule-based lookup for custom lexicon [EXTERNAL]
> >>>>
> >>>>
> >>>>
> >>>> I looked all over and could not find any information on how to add this
> >>>> pipeline component to cTAKES. I assume it uses UIMA Ruta?
> >>>>
> >>>> Thanks in advance!
> >>>>
> >>>> Greg--
> >>>> --
> >>>> Greg M. Silverman
> >>>> Senior Systems Developer
> >>>> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> >>>> Department of Surgery
> >>>> University of Minnesota
> >>>> g...@umn.edu
> >>>>
> >>>
> >>>
> --
> Dr. Peter Klügl
> Head of Text Mining/Machine Learning
>
> Averbis GmbH
> Salzstr. 15
> 79098 Freiburg
> Germany
>
> Fon: +49 761 708 394 0
> Fax: +49 761 708 394 10
> Email: peter.klu...@averbis.com
> Web: https://averbis.com
>
> Headquarters: Freiburg im Breisgau
> Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080 Managing 
> Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
>
>
