Re: performance report [EXTERNAL]

Peter Abramowitsch Mon, 25 Jan 2021 08:28:57 -0800

Great, thanks Greg.  I'd like to see what kind of stats are available
beyond what one can scrape from log4j.


Peter

On Mon, Jan 25, 2021 at 5:16 PM Greg Silverman <g...@umn.edu.invalid> wrote:

> Hi Sean,
> Thanks! I'll give it a whirl and let you know how it works out.
>
> Best!
>
> On Mon, Jan 25, 2021 at 8:48 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Greg, Peter,
> >
> > I believe that the performance report comes from a
> > CollectionProcessingEngine (CPE):
> > https://uima.apache.org/d/uimaj-current/apidocs/org/apache/uima/collection/CollectionProcessingEngine.html
> >
> > I think that UIMA's CPE GUI runs the pipeline through a CPE - hence the
> > tool's name, but that may have changed in recent years.
> >
> > The PipelineBuilder class in ctakes.core, used by the PiperFileRunner,
> > could be changed to run a single-threaded pipeline this way; right now
> > it uses a simpler uimaFIT method.
> > The code changes are relatively minor, but obviously significant testing
> > would be required.  The ctakes PipelineBuilder does use a CPE for
> > multi-threaded pipelines, so there has already been some testing on that
> > front.
> >
> > You can look at the ctakes PipelineBuilder run() method.  If you get rid
> > of the if (threadCount==1) {..} else {   then the CPE will always be used.
> > Then just add a cpe.getPerformanceReport() after cpe.process() and you
> > should have a ProcessTrace object.  This is where my guessing ends, as I
> > have never used a ProcessTrace and don't know exactly what to ask of it.
> >
> > I hope that is a decent start,
> > Sean
> > ________________________________________
> > From: Greg Silverman <g...@umn.edu.INVALID>
> > Sent: Saturday, January 23, 2021 3:01 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: performance report [EXTERNAL]
> >
> >
> > Hi Peter,
> > I have no doubt that performance varies with note style and pipeline
> > components.
> >
> > We're looking for a way to benchmark the standard/non-customized pipeline
> > performance for processing a largish set of identical notes using several
> > clinical NLP annotators (specifically, ctakes, biomedicus, metamap and
> > clamp). At the command line, both metamap and biomedicus output a
> > standard performance report with total timings and the details for each
> > specific pipeline component. I assume there is a way to enable, at the
> > command line, the performance report output available in the GUI version
> > of ctakes, which is what I'm really interested in.
> >
> > We're fine with information at a very coarse level, since we're
> > interested in a particular note type, so the aforementioned report should
> > be sufficient. I'm just wondering how to enable it using the standard
> > pipeline in cTAKES.
> >
> > Thanks!
> >
> > Greg--
> >
> >
> >
> > On Sat, Jan 23, 2021 at 12:26 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:
> >
> > > Hi Greg,
> > >
> > > I’ve found that there’s so much difference between note styles that
> > > have performance implications, and so many interactions between pipeline
> > > configurations which affect overall performance, that really the only
> > > way to get a sense of performance is either on a very coarse level,
> > > measuring process time across large collections of varied notes, or very
> > > granular, using something like jvisualvm.   Using the latter I saw some
> > > surprising things, some of which I was able to tackle with minor software
> > > changes, while others are deep in UIMA utilities used by cTAKES.  The
> > > biggest factor in my experience, after processing millions of notes, is
> > > when notes have reached about 5k AND are missing punctuation.  At around
> > > this size there begins a geometric rise in the complexity of internal
> > > structures that depend on sentences, and a serious elevation of
> > > processing time.
> > >
> > > Peter
> > >
> > >
> > > > On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> > > >
> > > > I found this:
> > > > https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2
> > > > which states: "A performance report is generated when the process is done."
> > > >
> > > > However, we are running this from the command line and no such report
> > > > is being generated.
> > > >
> > > > Thanks!
> > > >
> > > >> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
> > > >>
> > > >> Hi all,
> > > >> Is there a way to easily generate a performance report similar to
> > > >> the one generated by MetaMap (with timings for each task, etc.)?
> > > >>
> > > >> Thanks in advance!
> > > >>
> > > >> Greg--
> > > >>
> > > >> --
> > > >> Greg M. Silverman
> > > >> Senior Systems Developer
> > > >> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > >> Department of Surgery
> > > >> University of Minnesota
> > > >> g...@umn.edu
> > > >>
> > > >>
> > > >
> > >
> >
> >
> >
>
>
>
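
For the coarse, per-component timings discussed in this thread, one stopgap
that does not depend on the CPE is to time each stage by hand.  The following
is only a minimal sketch in plain Java: StageTimer and its methods (time,
calls, totalNanos, report) are hypothetical names made up for illustration and
are not part of cTAKES or UIMA, and the two string operations stand in for
real annotators.  It is not the ProcessTrace report that the CPE produces.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not a cTAKES or UIMA class.
final class StageTimer {
    // Accumulated wall-clock nanoseconds and call counts per stage name,
    // kept in first-seen order so the report follows pipeline order.
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();
    private final Map<String, Integer> callsByStage = new LinkedHashMap<>();

    // Runs one stage on the input, records how long it took, returns its result.
    <T> T time(String stage, T input, UnaryOperator<T> step) {
        long start = System.nanoTime();
        try {
            return step.apply(input);
        } finally {
            long elapsed = System.nanoTime() - start;
            nanosByStage.merge(stage, elapsed, Long::sum);
            callsByStage.merge(stage, 1, Integer::sum);
        }
    }

    // Number of times the named stage has run (0 if never seen).
    int calls(String stage) {
        return callsByStage.getOrDefault(stage, 0);
    }

    // Total time across all stages, in nanoseconds.
    long totalNanos() {
        long sum = 0;
        for (long n : nanosByStage.values()) {
            sum += n;
        }
        return sum;
    }

    // One line per stage (calls, milliseconds, share of total) plus a total line.
    String report() {
        StringBuilder sb = new StringBuilder();
        long total = totalNanos();
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            double pct = total == 0 ? 0.0 : 100.0 * e.getValue() / total;
            sb.append(String.format("%-12s calls=%d  %.3f ms  (%.1f%%)%n",
                    e.getKey(), calls(e.getKey()), e.getValue() / 1e6, pct));
        }
        sb.append(String.format("%-12s %.3f ms%n", "TOTAL", total / 1e6));
        return sb.toString();
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        // Stand-in "notes" and "stages"; a real run would call annotators here.
        for (String note : List.of("  Pt denies chest pain. ", "No fever. No cough.")) {
            String trimmed = timer.time("normalize", note, String::trim);
            timer.time("uppercase", trimmed, String::toUpperCase);
        }
        System.out.print(timer.report());
    }
}
```

This measures wall-clock time only, so for a benchmark across tools it would
make sense to run the same set of notes several times and compare totals
rather than read much into a single run.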
