Re: ytex DBconsumer and groovy parser

Richard Eckart de Castilho Tue, 01 Jul 2014 22:12:07 -0700

Hi John,

There is actually no big difference between analysis engines and consumers.

By default, a UIMA runtime may create multiple instances of an analysis engine 
and run them in parallel (if the runtime supports that),
but a "consumer" must see all data going through the pipeline, so there can 
only be one instance.

The default value of the flag that controls whether multiple instances are 
allowed is the only real difference.

Basically, any analysis engine that only reads annotations from the CAS and 
does not add or change anything is a consumer. Consequently, a consumer can be 
added anywhere in the pipeline, not only at the end (I sometimes do that to see 
intermediate results).
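
As a rough illustration of the point above, here is a toy model in plain Java. 
It is not the UIMA API: the Step interface, the annotator and consumer helpers 
and the list of strings standing in for the CAS are all made up for this 
sketch. It only shows that a step which reads without modifying can sit in the 
middle of a pipeline and capture intermediate results.

```java
import java.util.ArrayList;
import java.util.List;

public class ReadOnlyConsumerSketch {
    /** Hypothetical stand-in for a pipeline step; not a UIMA interface. */
    interface Step {
        void process(List<String> annotations);
    }

    /** An "annotator": adds one annotation to the shared list (the toy CAS). */
    static Step annotator(String label) {
        return annotations -> annotations.add(label);
    }

    /** A "consumer": only reads, storing a copy of what it saw. */
    static Step consumer(List<List<String>> snapshots) {
        return annotations -> snapshots.add(new ArrayList<>(annotations));
    }

    /** Runs every step in order over one shared annotation list. */
    static List<String> run(List<Step> pipeline) {
        List<String> annotations = new ArrayList<>();
        for (Step step : pipeline) {
            step.process(annotations);
        }
        return annotations;
    }

    public static void main(String[] args) {
        List<List<String>> snapshots = new ArrayList<>();
        List<Step> pipeline = List.of(
                annotator("Sentence"),
                consumer(snapshots),   // mid-pipeline: sees intermediate state
                annotator("Token"),
                consumer(snapshots));  // end of pipeline: sees final state
        List<String> result = run(pipeline);
        System.out.println(snapshots); // [[Sentence], [Sentence, Token]]
        System.out.println(result);    // [Sentence, Token]
    }
}
```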

If a component has the "allow multiple instances" flag set to "false" (which is 
usually what you want), then runtimes may react to that differently. E.g. the 
Collection Processing Engine (CPE) will single-thread all components (analysis 
engines or consumers) after it hits the first component with "allow multiple 
instances" set to false (which is typically a consumer). So to make optimal use 
of the CPE's multi-threading capabilities, such components should be placed 
towards the end of the CPE pipeline.
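
To make the ordering argument concrete, here is a small plain-Java sketch of 
the rule as described above. It is only a model under that description, not 
CPE code: the Component class, the multiThreadedPrefix method and the component 
names are hypothetical. It computes which components could still run 
multi-threaded, i.e. everything before the first component that disallows 
multiple instances.

```java
import java.util.ArrayList;
import java.util.List;

public class CpeThreadingSketch {
    /** Hypothetical stand-in for a pipeline component; not a UIMA class. */
    static final class Component {
        final String name;
        final boolean multipleInstancesAllowed;

        Component(String name, boolean multipleInstancesAllowed) {
            this.name = name;
            this.multipleInstancesAllowed = multipleInstancesAllowed;
        }
    }

    /**
     * Names of the components that precede the first component with
     * "allow multiple instances" set to false. In this model, that
     * component and everything after it run single-threaded.
     */
    static List<String> multiThreadedPrefix(List<Component> pipeline) {
        List<String> parallel = new ArrayList<>();
        for (Component c : pipeline) {
            if (!c.multipleInstancesAllowed) {
                break;
            }
            parallel.add(c.name);
        }
        return parallel;
    }

    public static void main(String[] args) {
        Component tokenizer = new Component("tokenizer", true);
        Component tagger = new Component("tagger", true);
        Component dbWriter = new Component("dbWriter", false);

        // Writer at the end: both annotators stay multi-threaded.
        System.out.println(multiThreadedPrefix(List.of(tokenizer, tagger, dbWriter)));
        // prints [tokenizer, tagger]

        // Writer at the front: nothing is left to run in parallel.
        System.out.println(multiThreadedPrefix(List.of(dbWriter, tokenizer, tagger)));
        // prints []
    }
}
```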

I believe there are a Java interface declaration and base classes for 
"CasConsumers" in UIMA, but I haven't used these in years. The uimaFIT API 
doesn't even support these because everything can also be (and is within 
uimaFIT) nicely modeled using analysis engines and the "allow multiple 
instances" flag.

Cheers,

-- Richard

On 02.07.2014, at 04:01, Masanz, James J. <masanz.ja...@mayo.edu> wrote:

> Hi John,
> 
> Not positive this is the line you are referring to, but there is a line in 
> cTAKES_clinical_pipeline.groovy (which is not in sandbox, btw) that has a 
> comment about 
> 
> "createAnalysisEngineDescription  expects name to not end in .xml even though 
> filename actually does"
> 
> I am guessing the comment you see is trying to say the same thing. 
> 
> cTAKES_clinical_pipeline.groovy is in  ctakes-core/scripts/groovy
> 
> In that script, line 321 is where the writer is specified. There is no 
> separately defined "consumer" in the same sense that the CPE GUI has 
> consumers that are separate from annotators. The script just uses the last 
> "annotator"  as a consumer and convention is AFAIK to call them writers in 
> this case.
> 
> Hope that helps,
> -- James
> 
> -----Original Message-----
> From: John Green [mailto:john.travis.gr...@gmail.com] 
> Sent: Tuesday, July 01, 2014 7:15 PM
> To: dev@ctakes.apache.org
> Subject: ytex DBconsumer and groovy parser
> 
> If someone has a free minute, which, judging from my own life is probably
> not the case - where in the groovy scripts in sandbox do you define the
> consumer to use? There is one comment that says "dont put the .xml here"
> then there is a path to the dictionary ae. I'm working by ssh from the
> hospital a lot in my "free time" in the ICU and running gui CPEs isn't
> gonna cut it.
> 
> Apropos the ytex dbconsumer - I should be able to just tack this on to the
> end of the ytex aggregate pipeline?
> 
> I'm probably still asking very naive questions but to date I still haven't
> had the time to dive into UIMA's base very well, so I apologize.
> 
> My goal is to run the full ytex pipeline from the command line with the
> ytex dbconsumer ...
> 
> Thanks for everyone's patience,
> John
