Well, obviously, covering the full range of permutations of all source files, all annotators, and pre- and post-cTAKES code would require a huge amount of commit information on thousands of files, and not only cTAKES files. Recently I made some pretty significant changes to the ZonerCli library, which is only a dependency of the cTAKES distribution. How would all that commit info be used to tag the end results? I think the answer is that it's simply not feasible or useful, so we haven't gone to those lengths.

The furthest we go at the UCs is to version the piper file and then write the versioned_name of the piper back into the JSON object returned for each note. We have our own REST service and our own Java and Python clients, but they don't touch the internals of the message in a way that interferes with the clinical informatics. The note concept collection object, with its piper version, is then persisted in our data store. The server jar also has a version, which is written to a log and updated whenever any significant framework changes are implemented. But the server version is not written into the data store.
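To make the piper-stamping idea concrete, here is a minimal sketch in Python. It is not our actual service code. The function names, the "piper_version" field, and the use of a content hash to build the versioned name are all assumptions for illustration; any stable naming scheme for the piper file would serve the same purpose.

```python
import hashlib
import json


def piper_version_tag(piper_name: str, piper_text: str) -> str:
    """Build a hypothetical versioned name: piper name plus a short content hash.

    Hashing the piper contents is an assumption made for this sketch; a
    hand-maintained version string would work just as well.
    """
    digest = hashlib.sha256(piper_text.encode("utf-8")).hexdigest()[:12]
    return f"{piper_name}@{digest}"


def stamp_note_result(note_result: dict, piper_tag: str) -> dict:
    """Return a copy of the per-note JSON object with the piper version attached."""
    stamped = dict(note_result)  # shallow copy so the caller's object is untouched
    stamped["piper_version"] = piper_tag  # hypothetical field name
    return stamped


if __name__ == "__main__":
    # Placeholder piper contents and note result, for illustration only.
    tag = piper_version_tag("ExamplePiper", "// placeholder piper contents\n")
    result = stamp_note_result({"note_id": "n1", "concepts": []}, tag)
    print(json.dumps(result, indent=2))
```

The stamped object is what would then be persisted, so every stored note carries the name of the piper that produced it.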
Not sure if any of this was helpful

On Fri, Oct 21, 2022 at 8:03 PM Miller, Timothy <timothy.mil...@childrens.harvard.edu.invalid> wrote:

> We’ve recently been using cTAKES for some internal projects where we make
> modifications, often using the REST server, combined with an open-source
> python client that makes the output of the REST server easy to post-process:
> https://github.com/Machine-Learning-for-Medical-Language/ctakes-client-py
> written by my colleagues Andy McMurry and Mike Terry, and pip installable.
> The output is then either converted to FHIR or written to whatever
> convenient format we need.
>
> But it’s useful to know for a given run on a given project, what was the
> NLP configuration that produced this output? Obviously, there are things
> like version numbers, but since cTAKES is highly configurable, and our
> post-processing libraries have versions, and we may use trunk or a previous
> commit instead of releases, things get complicated quickly. Does anyone
> have an existing solution they are willing to share? Or does anyone have
> any thoughts on this topic? This question goes slightly beyond cTAKES, but
> cTAKES is responsible for a lot of the complexity in figuring this out
> since it’s the most configurable component.
>
> Thanks
> Tim