Re: Testing the 5.0 version [EXTERNAL]
Hi Peter,

That is great news. Sometime soon I will take a gander at ctakes and see if I can identify areas of importance or concern to me and what I might do to test them. However, don't think of that as being a definitive list.

All, please take advantage of Peter's offer and share items that you would like to receive some attention.

If anybody can, please work with Peter to help keep ctakes a top-notch application for clinical NLP.

Cheers,
Sean

From: Peter Abramowitsch
Sent: Monday, August 7, 2023 11:48 AM
To: dev@ctakes.apache.org
Subject: Testing the 5.0 version [EXTERNAL]

Hi Sean, looks like my funding for some experimentation with 5.0 is finally going to happen in a month or so. I'm going to be looking at all the new functionality (I'm back on a branch of 4.0.1 on a custom webservices platform), but is there any particular area of 5.0 that you'd like me to exercise?

Peter
Initial CTakes analysis
I am looking for an NLP tool to read pathology reports and extract cancer-related site, histology, stage, and any other DX/RX data available. In looking at CTakes, I have a few questions:

- Is CTakes an appropriate tool to automate this task?
- The end goal would be a fully automated tool where text is presented to an API and data is returned.
- An added bonus would be for the tool to annotate the text, so that a reviewer can more easily find the relevant data.
- For someone with a strong IT/software development background but no NLP background, what is the level of difficulty in getting started with this product?

Paul R. Stearns
Advanced Consulting Enterprises, Inc.
15150 NW 79th Court, Suite: 206
Miami Lakes Fl, 33016
Voice: (305)623-0360 x107
Fax: (305)623-4588
Re: Testing the 5.0 version [EXTERNAL]
Hi Sean,

One area I can think of is migrating the ctakes web REST module from the traditional Spring framework to Spring Boot. That would let end users bootstrap and test the REST API easily, which could improve the overall adoption rate without adding major complexity.

On Thu, 10 Aug 2023 at 20:07, Finan, Sean wrote:

> Hi Peter,
>
> That is great news. Sometime soon I will take a gander at ctakes and
> see if I can identify areas of importance or concern to me and what I might
> do to test them. However, don't think of that as being a definitive list.
>
> All, please take advantage of Peter's offer and share items that you would
> like to receive some attention.
>
> If anybody can, please work with Peter to help keep ctakes a top-notch
> application for clinical NLP.
>
> Cheers,
> Sean

--
Regards,
Gandhi

"The best way to find urself is to lose urself in the service of others !!!"
Re: Testing the 5.0 version [EXTERNAL]
Hi Gandhi,

That is a great idea!

I would like to put off adding new functionality until 5.0 is released. I am hoping that we can release what is in the GitHub repo as it is right now, save for bug fixes.

I will try to keep your Spring Boot idea on my radar for a version 6 upgrade. Would you be able to help with that?

Thanks,
Sean

From: gandhi rajan
Sent: Thursday, August 10, 2023 1:20 PM
To: dev@ctakes.apache.org
Subject: Re: Testing the 5.0 version [EXTERNAL]

Hi Sean,

One area I can think of is migrating the ctakes web REST module from the traditional Spring framework to Spring Boot. That would let end users bootstrap and test the REST API easily, which could improve the overall adoption rate without adding major complexity.
Re: Testing the 5.0 version [EXTERNAL]
Hi Sean and everyone,

I'm happy to receive suggestions from others, but because this is funded by a client, I will eventually have to put more effort into areas that are of interest to them. Also, it is a limited engagement, so I can't guarantee how thorough my testing will be, but I'll certainly report back what I find. I'll probably begin this work towards the middle of September.

One area I probably will not test is the web REST module, as we already have a REST framework that we'll be continuing to use. It is lightweight, tailored to our production and security needs, and its performance under heavy load is well understood. It is based on a SparkJava-Jetty foundation rather than Spring. So maybe someone else can test the official version.

Regards,
Peter

On Thu, Aug 10, 2023 at 7:37 AM Finan, Sean wrote:

> Hi Peter,
>
> That is great news. Sometime soon I will take a gander at ctakes and
> see if I can identify areas of importance or concern to me and what I might
> do to test them. However, don't think of that as being a definitive list.
>
> All, please take advantage of Peter's offer and share items that you would
> like to receive some attention.
>
> If anybody can, please work with Peter to help keep ctakes a top-notch
> application for clinical NLP.
>
> Cheers,
> Sean
Re: Testing the 5.0 version [EXTERNAL]
Hi Sean,

I can surely help out on that with some guidance and support from additional technical hands.

On Thu, 10 Aug 2023 at 22:59, Finan, Sean wrote:

> Hi Gandhi,
>
> That is a great idea!
>
> I would like to put off adding new functionality until 5.0 is released.
> I am hoping that we can release what is in the GitHub repo as it is right
> now, save for bug fixes.
>
> I will try to keep your Spring Boot idea on my radar for a version 6
> upgrade. Would you be able to help with that?
>
> Thanks,
> Sean

--
Regards,
Gandhi

"The best way to find urself is to lose urself in the service of others !!!"
Fwd: Initial CTakes analysis
Hi Paul,

Out of the box, cTakes would get you part of the way there, but it would require several types of customization to meet your requirements. All of these are the kind of customizations that most of us have had to do, so there's nothing new here, but they are not trivial. As I see it, they fall into these categories:

1. Getting familiar with the cTakes application, pipeline, annotator, and vocabulary ecosystem
2. Choosing a vocabulary subset that gives the best coverage of the terms you are looking for
3. Adding one or more custom dictionaries to add terms and synonyms that are not present
4. Maybe employing the anatomical site annotator in your pipeline
5. Deciding how to harvest and structure the data you extract from the CAS object, which all the annotators target
6. Deciding how to deploy the application (standalone? webservices host? multi-instance?). Many considerations go into this, and they greatly affect your ability to scale.

There is more than one architectural solution that will work and allow you to get to your "fully automated" goal, but you will need to implement that yourself.

A hint about highlighting the text: all annotations carry text offsets, so with these you can write code (usually JS and CSS) to do your highlighting. Native cTakes does not have any graphical display functionality.

Another hint learned from experience: if you have many large texts (say, 20 KB and above, with lots of potential terms to discover), you can achieve much better throughput by breaking these into smaller chunks at sentence boundaries and tweaking offsets accordingly as you reassemble the chunks. The memory requirements grow rapidly with the size of the note.

In summary, a strong developer background is a good starting point. To that you'd want to add medical informatics and experience with scalable architectures. cTakes is a great kernel for your system, but be prepared to dive deep.
Peter

On Thu, Aug 10, 2023 at 10:06 AM Paul Stearns wrote:

> I am looking for an NLP tool to read pathology reports and extract
> cancer-related site, histology, stage, and any other DX/RX data available.
> In looking at CTakes, I have a few questions:
>
> - Is CTakes an appropriate tool to automate this task?
> - The end goal would be a fully automated tool where text is presented to
>   an API and data is returned.
> - An added bonus would be for the tool to annotate the text, so that a
>   reviewer can more easily find the relevant data.
> - For someone with a strong IT/software development background but no NLP
>   background, what is the level of difficulty in getting started with this
>   product?
>
> Paul R. Stearns
> Advanced Consulting Enterprises, Inc.
> 15150 NW 79th Court, Suite: 206
> Miami Lakes Fl, 33016
>
> Voice: (305)623-0360 x107
> Fax: (305)623-4588
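
The two hints in Peter's reply (splitting large notes at sentence boundaries and re-basing offsets on reassembly, and using annotation offsets to drive highlighting) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not cTAKES code. The names `split_at_sentences`, `shift_annotations`, and `highlight` are hypothetical, annotations are assumed to have already been harvested from the CAS as (begin, end, label) tuples, the regex is a naive stand-in for a real sentence detector, and the markup is generated as an HTML string here rather than with JS and CSS in the browser.

```python
import html
import re
from typing import List, Tuple

# Hypothetical annotation shape: (begin, end, label), offsets relative to the analyzed text.
Annotation = Tuple[int, int, str]


def split_at_sentences(text: str, max_chars: int) -> List[Tuple[int, str]]:
    """Split text into (start_offset, chunk) pairs, cutting only after a
    sentence-ending mark followed by whitespace. A chunk may exceed
    max_chars when a single sentence is longer than the limit."""
    chunks: List[Tuple[int, str]] = []
    start = 0  # offset where the current chunk begins
    prev = 0   # most recent sentence boundary seen
    boundaries = [m.end() for m in re.finditer(r"[.!?]\s+", text)]
    for boundary in boundaries + [len(text)]:
        # Close the chunk at the previous boundary once adding the next
        # sentence would push it past the limit.
        if boundary - start > max_chars and prev > start:
            chunks.append((start, text[start:prev]))
            start = prev
        prev = boundary
    if start < len(text):
        chunks.append((start, text[start:]))
    return chunks


def shift_annotations(annotations: List[Annotation], offset: int) -> List[Annotation]:
    """Re-base chunk-relative offsets onto the original document."""
    return [(begin + offset, end + offset, label) for begin, end, label in annotations]


def highlight(text: str, annotations: List[Annotation]) -> str:
    """Wrap each annotated span in a <mark> element, escaping everything else.
    Overlapping spans are skipped in this sketch."""
    out: List[str] = []
    cursor = 0
    for begin, end, label in sorted(annotations):
        if begin < cursor:
            continue  # overlaps an already emitted span
        out.append(html.escape(text[cursor:begin]))
        out.append('<mark title="%s">%s</mark>'
                   % (html.escape(label), html.escape(text[begin:end])))
        cursor = end
    out.append(html.escape(text[cursor:]))
    return "".join(out)
```

In use, each chunk would be sent through the pipeline separately, the returned annotations passed through `shift_annotations` with that chunk's start offset, and the merged list handed to `highlight` together with the original text. Because the chunks are contiguous slices of the original, joining them reproduces the input exactly, which is what keeps the re-based offsets valid.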