Hi Greg,

I’ve found that note styles differ so much in ways that affect performance, 
and that so many interactions between pipeline configurations affect overall 
performance, that really the only way to get a sense of performance is either 
at a very coarse level, measuring process time across large collections of 
varied notes, or at a very granular level, using something like jvisualvm. 
Using the latter I saw some surprising things, some of which I was able to 
tackle with minor software changes, while others are deep in the UIMA 
utilities used by cTAKES. The biggest factor in my experience, after 
processing millions of notes, is notes that have reached about 5k AND are 
missing punctuation. At around this size, the complexity of internal 
structures that depend on sentences begins a geometric rise, and processing 
time climbs sharply.
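
As a rough illustration of the size-plus-punctuation point, here is a minimal 
Python sketch of a pre-filter that flags notes likely to be slow before they 
reach the pipeline. It is not part of cTAKES or UIMA. The names 
(SIZE_THRESHOLD, MIN_TERMINATORS_PER_1000, is_likely_slow) and the density 
cutoff are hypothetical, and reading "5k" as 5000 characters is an assumption, 
so tune both against your own timings.

```python
import re

# Assumption: "5k" is read as 5000 characters; the unit is not stated above.
SIZE_THRESHOLD = 5000
# Hypothetical cutoff: sentence terminators per 1000 characters below which
# a note is treated as "missing punctuation".
MIN_TERMINATORS_PER_1000 = 2.0

_TERMINATORS = re.compile(r"[.!?]")


def terminator_density(text: str) -> float:
    """Return the number of sentence terminators per 1000 characters."""
    if not text:
        return 0.0
    return 1000.0 * len(_TERMINATORS.findall(text)) / len(text)


def is_likely_slow(text: str) -> bool:
    """Flag notes that are both large and sparse in sentence terminators."""
    return (
        len(text) >= SIZE_THRESHOLD
        and terminator_density(text) < MIN_TERMINATORS_PER_1000
    )
```

Flagged notes could be timed separately from the rest, so that the coarse 
per-collection numbers are not dominated by a handful of outliers.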

Peter

Sent from my iPad

> On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> 
> I found this:
> https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2, which
> states: "A performance report is generated when the process is done."
> 
> However, we are running this from the command line and no such report is
> being generated.
> 
> Thanks!
> 
>> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
>> 
>> Hi all,
>> Is there a way to easily generate a performance report similar to the one
>> generated by MetaMap (with timings for each task, etc.)?
>> 
>> Thanks in advance!
>> 
>> Greg--
>> 
>> --
>> Greg M. Silverman
>> Senior Systems Developer
>> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
>> Department of Surgery
>> University of Minnesota
>> g...@umn.edu
>> 
>> 
> 
> -- 
> Greg M. Silverman
> Senior Systems Developer
> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> Department of Surgery
> University of Minnesota
> g...@umn.edu
