Why not use Docker and versioning by tags? See "C. Boettiger, An introduction to Docker for reproducible research, SIGOPS Oper. Syst. Rev. 49 (2015) 71–79. doi:10.1145/2723872.2723882."
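
As a rough sketch of what I mean, a tag could be combined with the piper-versioning idea quoted below so that every note's output carries a record of what produced it. This is only an illustration under my own assumptions: the function names, the field names (`piper_sha256`, `image_tag`, `client_version`, `nlp_provenance`) and the example values are all hypothetical, and none of this is an existing cTAKES or ctakes-client-py feature.

```python
import hashlib
import json


def build_provenance(piper_text: str, image_tag: str, client_version: str) -> dict:
    """Collect identifiers describing one NLP configuration (hypothetical fields)."""
    # Hashing the piper file contents pins the pipeline definition even when
    # the file name or its human-assigned version label stays the same.
    piper_sha256 = hashlib.sha256(piper_text.encode("utf-8")).hexdigest()
    return {
        "piper_sha256": piper_sha256,
        "image_tag": image_tag,            # assumed: tag of the container that ran the server
        "client_version": client_version,  # assumed: version of the post-processing client
    }


def stamp_note(note_result: dict, provenance: dict) -> dict:
    """Return a copy of one note's output with the provenance record attached."""
    stamped = dict(note_result)  # shallow copy so the caller's dict is not modified
    stamped["nlp_provenance"] = provenance
    return stamped


if __name__ == "__main__":
    # All values below are placeholders, not real configuration.
    prov = build_provenance("placeholder piper contents", "my-ctakes-rest:2022-10-21", "0.0.0")
    note = {"note_id": "example-note", "concepts": []}
    print(json.dumps(stamp_note(note, prov), indent=2))
```

One caveat: a Docker tag can be moved to a different image later, so where an image digest is available it is the safer thing to record alongside (or instead of) the tag.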

On Fri, Oct 21, 2022 at 3:15 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:

> Well, obviously, the full range of permutations of all source files and all
> annotators and pre and post ctakes code would require a huge amount of
> commit information on thousands of files... and not only ctakes
> files...recently I made some pretty significant changes to the ZonerCli
> library which is only a dependency of the ctakes distribution. How would
> all the commit info be used to tag the end results. I think the answer is
> that it's simply not feasible or useful. So we haven't gone to those
> lengths. As far as we go at the UCs is to version the piper file and then
> write the versioned_name of the piper back into the json object returned
> for each note... We have our own rest service and our own Java and Python
> clients, but they don't touch the internals of the message in a way that
> interferes with the clinical informatics. The note concept collection
> object with its piper version is then persisted in our data store. The
> server jar also has a version which writes into a log and is updated
> whenever any significant framework changes are implemented. But the
> server version is not written into the data-store.
>
> Not sure if any of this was helpful
>
> On Fri, Oct 21, 2022 at 8:03 PM Miller, Timothy
> <timothy.mil...@childrens.harvard.edu.invalid> wrote:
>
> > We’ve recently been using cTAKES for some internal projects where we make
> > modifications, often using the REST server, combined with an open-source
> > python client that makes the output of the REST server easy to post-process:
> > https://github.com/Machine-Learning-for-Medical-Language/ctakes-client-py
> > written by my colleagues Andy McMurry and Mike Terry, and pip installable.
> > The output is then either converted to FHIR or written to whatever
> > convenient format we need.
> >
> > But it’s useful to know for a given run on a given project, what was the
> > NLP configuration that produced this output? Obviously, there are things
> > like version numbers, but since cTAKES is highly configurable, and our
> > post-processing libraries have versions, and we may use trunk or a previous
> > commit instead of releases, things get complicated quickly. Does anyone
> > have an existing solution they are willing to share? Or does anyone have
> > any thoughts on this topic? This question goes slightly beyond cTAKES, but
> > cTAKES is responsible for a lot of the complexity in figuring this out
> > since it’s the most configurable component.
> >
> > Thanks
> > Tim
> >
>

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu