Hi Greg,

I’ve found that note styles differ so much in ways that affect performance, and pipeline configurations interact in so many ways, that the only practical way to get a sense of performance is either at a very coarse level, measuring process time across large collections of varied notes, or at a very granular level, using something like jvisualvm. Using the latter I saw some surprising things. Some I was able to tackle with minor software changes, while others are deep in the UIMA utilities used by cTAKES.

The biggest factor in my experience, after processing millions of notes, is notes that have reached about 5k AND are missing punctuation. At around this size, the internal structures that depend on sentences begin a geometric rise in complexity, and processing time climbs sharply.
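To make the "large AND missing punctuation" condition concrete, here is a minimal sketch of a pre-screen that flags such notes before they enter a pipeline. It is not part of cTAKES or UIMA. The function names, the choice of characters as the unit for "5k", and the punctuation cutoff are all assumptions made for illustration, so tune them against your own timings.

```python
import re

# Hypothetical thresholds: the message above says "about 5k" without naming
# a unit, so characters are assumed here. The punctuation cutoff is a guess.
SIZE_THRESHOLD = 5_000
MIN_TERMINATORS_PER_1K = 1.0

# Characters treated as sentence terminators for this rough check.
_TERMINATORS = re.compile(r"[.!?;]")


def terminator_density(text: str) -> float:
    """Return the number of sentence terminators per 1,000 characters."""
    if not text:
        return 0.0
    return len(_TERMINATORS.findall(text)) * 1000.0 / len(text)


def is_risky(text: str,
             size_threshold: int = SIZE_THRESHOLD,
             min_density: float = MIN_TERMINATORS_PER_1K) -> bool:
    """Flag notes that are both large and sparse in sentence punctuation."""
    return (len(text) >= size_threshold
            and terminator_density(text) < min_density)


if __name__ == "__main__":
    unpunctuated = "bp stable pt resting comfortably " * 400
    punctuated = "BP stable. Patient resting comfortably. " * 400
    print(is_risky(unpunctuated))  # long, no terminators
    print(is_risky(punctuated))    # long, but well punctuated
```

Flagged notes could be timed separately or routed to a different configuration, which keeps the coarse timings for the rest of the collection easier to interpret.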
Peter

Sent from my iPad

> On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
>
> I found this:
> https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2, which
> states: "A performance report is generated when the process is done."
>
> However, we are running this from the command line and no such report is
> being generated.
>
> Thanks!
>
>> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
>>
>> Hi all,
>> Is there a way to easily generate a performance report similar to the one
>> generated by MetaMap (with timings for each task, etc.)?
>>
>> Thanks in advance!
>>
>> Greg--
>>
>> --
>> Greg M. Silverman
>> Senior Systems Developer
>> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
>> Department of Surgery
>> University of Minnesota
>> g...@umn.edu
>>
>>
>
> --
> Greg M. Silverman
> Senior Systems Developer
> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> Department of Surgery
> University of Minnesota
> g...@umn.edu