Hi Ben,

If you watch my presentation from ApacheCon, you'll see how we went about mass extraction of notes. The video contains two presentations, and mine starts about halfway through: https://www.youtube.com/watch?v=F5WCCPWz7Z0
In the same conference thread, two other groups were working on similar projects using different approaches. Here's one of them: https://www.youtube.com/watch?v=kZw42pGzyHs

Peter

On Thu, Jul 29, 2021 at 11:25 AM Benjamin hansen <benjaminkakke...@gmail.com> wrote:

> Thank you for these insights, Peter.
> Your project sounds very interesting. Are you using the UIMA pipeline on a
> cluster to process that many notes? And how long does it take?
>
> I have been considering using uimaFIT and cTakes together with Apache Spark
> for distributed computation. I saw a video from Philip Ogren from 2014
> describing this, but unfortunately he does not give any details on how he
> did it.
> Would you by any chance know where I can find more information about how to
> achieve this?
>
> Best regards
>
> On Thu, Jul 29, 2021 at 6:21 PM Peter Abramowitsch <pabramowit...@gmail.com>
> wrote:
>
> > Hi Ben,
> >
> > I can only speak for myself, but I am using cTakes extensively at two major
> > California universities in multiple projects. The customizations I am
> > doing are mostly specific to the facility and to the project, and
> > therefore wouldn't be for inclusion in the source repository. We have
> > just finished using it to extract concepts from 102 million notes.
> >
> > Unless I am wrong, updates are going into the Apache SVN repository, and
> > GitHub has acted as a backup repo. It's true that there isn't a
> > well-organized update, bugfix & release team and schedule. Others can speak
> > to that too, but I suspect that part of this is because the core is very
> > stable and most modifications and enhancements are, like mine, local to a
> > project. However, there has been talk, but not much definitive action, on
> > two initiatives: upgrading to the current version of UIMA and including
> > the Ruta engine as one of the pluggable components.
> >
> > The user base is fairly substantial for a project this specialized.
> > I suggest you have a look at the presentations in the cTakes thread of the
> > 2020 ApacheCon conference.
> >
> > You'll have to search through this list:
> > https://www.youtube.com/playlist?list=PLU2OcwpQkYCy_awEe5xwlxGTk5UieA37m
> >
> > By all means, come and join us!
> >
> > Regards, Peter
> >
> > On Thu, Jul 29, 2021 at 12:24 AM Benjamin hansen <benjaminkakke...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I hope it's okay that I send this mail here; I was not sure where else to
> > > pose my question.
> > >
> > > We are considering using cTakes in our applications. However, I became
> > > concerned when I saw the lack of activity in the GitHub repository, which
> > > is why I want to ask:
> > >
> > > Is cTakes still being actively developed and maintained, or have
> > > developers moved on to other systems instead?
> > >
> > > Is there any kind of up-to-date roadmap for the active development of
> > > cTakes?
> > >
> > > What is the cTakes user base like these days? Is it growing, dwindling,
> > > stable, or non-existent?
> > >
> > > Thanks in advance.