Hi Sean,
Thanks! I'll give it a whirl and let you know how it works out.

Best!

On Mon, Jan 25, 2021 at 8:48 AM Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Greg, Peter,
>
> I believe that the performance report comes from a
> CollectionProcessingEngine (CPE)
> https://uima.apache.org/d/uimaj-current/apidocs/org/apache/uima/collection/CollectionProcessingEngine.html
>
>
> I think that UIMA's CPE GUI runs the pipeline through a CPE - hence the
> tool's name, but that may have changed in recent years.
>
> The PipelineBuilder class in ctakes.core used by the PiperFileRunner could
> be changed to use this style of running a single-threaded pipeline - right
> now it uses a simpler uimaFIT method.
> The code changes are relatively minor, but obviously significant testing
> would be required.  The ctakes PipelineBuilder does use a CPE for
> multi-threaded pipelines, so there has already been some testing on that
> front.
>
> You can look at the ctakes PipelineBuilder run() method.  If you get rid
> of the if (threadCount==1) {..} else { branch, then the CPE will always be
> used.  Then just add a cpe.getPerformanceReport() call after cpe.process()
> and you should have a ProcessTrace object.  This is where my guessing ends,
> as I have never used a ProcessTrace and don't know exactly what to beg of it.
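
A possible starting point for the ProcessTrace step described above, offered as a hedged sketch rather than tested cTAKES code. As far as I know, UIMA's ProcessTrace exposes getEvents(), and each ProcessTraceEvent exposes getComponentName() and getDuration() (milliseconds). To stay runnable without UIMA on the classpath, the Java below uses a hypothetical stand-in class, TimedEvent, for those two values; TraceReport, TimedEvent, and the sample component names and timings are invented for illustration. One caveat worth checking: the UIMA Javadoc describes cpe.process() as returning immediately while processing continues on another thread, so the report may need to be read from a StatusCallbackListener's collectionProcessComplete() rather than on the line right after process().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch only: formats a coarse per-component timing report.
class TraceReport {

    // Hypothetical stand-in for the two values assumed to come from a UIMA
    // ProcessTraceEvent: getComponentName() and getDuration() in milliseconds.
    static final class TimedEvent {
        final String component;
        final long millis;

        TimedEvent(String component, long millis) {
            this.component = component;
            this.millis = millis;
        }
    }

    // Sums durations per component, keeping first-seen order.
    static Map<String, Long> totalsByComponent(List<TimedEvent> events) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (TimedEvent e : events) {
            totals.merge(e.component, e.millis, Long::sum);
        }
        return totals;
    }

    // Renders one line per component (time and share of total) plus a total line.
    static String format(List<TimedEvent> events) {
        Map<String, Long> totals = totalsByComponent(events);
        long grandTotal = 0;
        for (long ms : totals.values()) {
            grandTotal += ms;
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            double pct = grandTotal == 0 ? 0.0 : 100.0 * entry.getValue() / grandTotal;
            sb.append(String.format(Locale.ROOT, "%-30s %8d ms %6.1f%%%n",
                    entry.getKey(), entry.getValue(), pct));
        }
        sb.append(String.format(Locale.ROOT, "%-30s %8d ms%n", "TOTAL", grandTotal));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder names and timings, not measurements from any real run.
        List<TimedEvent> events = new ArrayList<>();
        events.add(new TimedEvent("ComponentA", 120));
        events.add(new TimedEvent("ComponentB", 300));
        events.add(new TimedEvent("ComponentA", 30));
        System.out.print(format(events));
    }
}
```

In a real run, one TimedEvent would presumably be built per ProcessTraceEvent returned by cpe.getPerformanceReport().getEvents().
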
>
> I hope that is a decent start,
> Sean
> ________________________________________
> From: Greg Silverman <g...@umn.edu.INVALID>
> Sent: Saturday, January 23, 2021 3:01 PM
> To: dev@ctakes.apache.org
> Subject: Re: performance report [EXTERNAL]
>
> Hi Peter,
> I have no doubt that performance varies with note style and pipeline
> components.
>
> We're looking for a way to benchmark the standard/non-customized pipeline
> performance for processing a largish set of identical notes using several
> clinical NLP annotators (specifically, cTAKES, BioMedICUS, MetaMap and
> CLAMP). At the command line, both MetaMap and BioMedICUS output a standard
> performance report with total timings and the details for each specific
> pipeline component. I assume there is a way to enable, at the command line,
> the performance report output available in the GUI version of cTAKES -
> which is what I'm really interested in.
>
> We're fine with information at a very coarse level, since we're interested
> in a particular note type, so the aforementioned report should be
> sufficient. I'm just wondering how to enable it using the standard pipeline
> in cTAKES.
>
> Thanks!
>
> Greg--
>
>
>
> On Sat, Jan 23, 2021 at 12:26 PM Peter Abramowitsch <
> pabramowit...@gmail.com>
> wrote:
>
> > Hi Greg,
> >
> > I’ve found that note styles differ so much in ways that affect
> > performance, and pipeline configurations interact in so many ways that
> > affect overall performance, that really the only way to get a sense of
> > performance is either at a very coarse level, measuring process time
> > across large collections of varied notes, or at a very granular level,
> > using something like jvisualvm.   Using the latter I saw some surprising
> > things, some of which I was able to tackle with minor software changes,
> > while others are deep in UIMA utilities used by cTAKES.  The biggest
> > factor in my experience, after processing millions of notes, is notes
> > that have reached about 5k AND are missing punctuation.  At around this
> > size begins a geometric rise in the complexity of internal structures
> > that depend on sentences, and a serious elevation of processing time.
> >
> > Peter
> >
> > Sent from my iPad
> >
> > > On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> > >
> > > I found this:
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__medium.com_-40felix-5Fchan_install-2Dapache-2Dctakes-2D924c40967ce2&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uuvD9Z5PgR1KUWZ1Dc80V19dfKcr2DTrMuBxe2OCbMc&s=s-jUaTKHh4ts1f2UzY5nHsKbjA27HDpqAchBF36juTI&e=
> > > , which states: "A performance report is generated when the process is
> > > done."
> > >
> > > However, we are running this from the command line and no such report
> > > is being generated.
> > >
> > > Thanks!
> > >
> > >> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
> > >>
> > >> Hi all,
> > >> Is there a way to easily generate a performance report similar to the
> > >> one generated by MetaMap (with timings for each task, etc.)?
> > >>
> > >> Thanks in advance!
> > >>
> > >> Greg--
> > >>
> > >> --
> > >> Greg M. Silverman
> > >> Senior Systems Developer
> > >> NLP/IE <https://urldefense.proofpoint.com/v2/url?u=https-3A__healthinformatics.umn.edu_research_nlpie-2Dgroup&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uuvD9Z5PgR1KUWZ1Dc80V19dfKcr2DTrMuBxe2OCbMc&s=5Kgux8IKOmsj2xjj7DxAhKZf6anK7HF3ddsOhnI1VFM&e=>
> > >> Department of Surgery
> > >> University of Minnesota
> > >> g...@umn.edu
> > >>
> > >>
> > >
> >
>
>
>


-- 
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu
