Hi Peter,
I have no doubt that performance varies considerably with note style and
pipeline components.

We're looking for a way to benchmark the performance of the standard,
non-customized pipeline when processing a largish set of identical notes
with several clinical NLP annotators (specifically cTAKES, BioMedICUS,
MetaMap and CLAMP). At the command line, both MetaMap and BioMedICUS output
a standard performance report with total timings and details for each
pipeline component. I assume there is a way to enable, at the command line,
the performance report available in the GUI version of cTAKES, which is
what I'm really interested in.

We're fine with information at a very coarse level, since we're interested
in a particular note type, so the aforementioned report should be
sufficient. I'm just wondering how to enable it using the standard pipeline
in cTAKES.
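
For reference, a coarse wall-clock timing can also be taken from outside any
of these tools. The following is only a sketch: time_command and summarize
are hypothetical helper names, the command shown is a placeholder rather
than a real cTAKES invocation, and it yields a single total, not the
per-component breakdown that the report provides.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of argument strings) and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit
    return time.perf_counter() - start


def summarize(elapsed_seconds, note_count):
    """Return total seconds and mean seconds per note for one run."""
    if note_count <= 0:
        raise ValueError("note_count must be positive")
    return {"total_s": elapsed_seconds, "per_note_s": elapsed_seconds / note_count}


if __name__ == "__main__":
    # Placeholder command: replace with the actual pipeline invocation.
    # The Python no-op below only keeps the sketch runnable as-is.
    cmd = [sys.executable, "-c", "pass"]
    elapsed = time_command(cmd)
    print(summarize(elapsed, note_count=1))
```

Running each annotator over the same note set through the same wrapper would
at least make the totals comparable across tools.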

Thanks!

Greg--



On Sat, Jan 23, 2021 at 12:26 PM Peter Abramowitsch <pabramowit...@gmail.com>
wrote:

> Hi Greg,
>
> I’ve found that note styles differ so much in ways that affect
> performance, and pipeline configurations interact in so many ways that
> affect overall performance, that the only way to get a sense of
> performance is either at a very coarse level, measuring process time
> across large collections of varied notes, or at a very granular level,
> using something like jvisualvm. Using the latter I saw some surprising
> things, some of which I was able to tackle with minor software changes,
> while others are deep in UIMA utilities used by cTAKES. In my experience,
> after processing millions of notes, the biggest factor is notes that have
> reached about 5k AND are missing punctuation. At around this size, the
> complexity of internal structures that depend on sentences begins to
> rise geometrically, and processing time climbs sharply.
>
> Peter
>
> Sent from my iPad
>
> > On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> >
> > I found this:
> > https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2, which
> > states: "A performance report is generated when the process is done."
> >
> > However, we are running this from the command line and no such report is
> > being generated.
> >
> > Thanks!
> >
> >> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
> >>
> >> Hi all,
> >> Is there a way to easily generate a performance report similar to the one
> >> generated by MetaMap (with timings for each task, etc.)?
> >>
> >> Thanks in advance!
> >>
> >> Greg--
> >>
> >> --
> >> Greg M. Silverman
> >> Senior Systems Developer
> >> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> >> Department of Surgery
> >> University of Minnesota
> >> g...@umn.edu
> >>
> >>
> >
>


-- 
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu
