Hi Sean,

I was able to run the pipeline yesterday afternoon. Here is how, though I am 
still a little baffled:

1) There is a noticeable difference between the built code environments in Dev 
and Prod. In Prod, under the exploded Apache cTakes 4.0.0.1 src directory, the 
ctakes-distribution module has a target/ctakes-4.0.0.1/lib directory containing 
all of the Maven library jar files. Directly under the target directory, above 
that path, are the generated cTakes core jar, tar.gz, and zip files. In Dev, 
there is no /target/ctakes-4.0.0.1 sub-directory; there are only the generated 
core jar, tar.gz, and zip files.
2) I was able to run the cTakes pipeline only after I updated the class path to 
include 
/opt/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/ctakes-4.0.0.1/lib/* 
in addition to the location where we store all of the desc, dictionaries, 
hibernate connection properties, etc., and only after I copied 
/target/ctakes-4.0.0.1-core-with-dependencies.jar into the 
/opt/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/ctakes-4.0.0.1/lib/ 
path. Now the code runs perfectly fine. If I do not copy the core jar to this 
location, Java cannot load the primary class that starts the CPE processing 
and reads the CPE descriptor file. 
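
As a rough sketch of the launch described in item 2: a class path wildcard such 
as lib/* picks up only the jar files sitting directly in that directory, which 
would explain why the core jar had to be copied into lib/. Naming the core jar 
as its own class path entry might work as an alternative to copying it. The 
helper name join_classpath and the variables RESOURCES_CP, CORE_JAR, MAIN_CLASS 
and CPE_DESCRIPTOR are placeholders invented for this example, not names from 
the actual scripts.

```shell
#!/bin/sh
# Join all arguments with ':' (the class path separator on Linux and macOS).
# Runs in a subshell so the IFS change does not leak into the caller.
join_classpath() {
    ( IFS=:; printf '%s\n' "$*" )
}

# Hypothetical usage (all variables are placeholders to fill in locally):
#   CTAKES_LIB=/opt/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/ctakes-4.0.0.1/lib
#   java -cp "$(join_classpath "$RESOURCES_CP" "$CTAKES_LIB/*" "$CORE_JAR")" \
#        "$MAIN_CLASS" "$CPE_DESCRIPTOR"
# Keeping the class path inside double quotes stops the shell from expanding
# the '*', so the JVM receives the wildcard and expands it itself.
```

The point of quoting is that the JVM, not the shell, should see lib/*; an 
unquoted wildcard would be expanded by the shell into a space-separated list 
that java cannot use as a class path.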

Next, I will likely need to create a new empty directory, expand the source 
there, copy in our module and updated pom, and build, to verify what gets 
populated into the target sub-directory under ctakes-distribution. If the build 
produces only the jars, then some build step or other detail is missing, and I 
will need to figure it out to ensure we can build new code and then run it. 
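
One way to make that verification repeatable is a small check of the target 
directory after the build. This is only a sketch: the function name 
check_distribution_layout is made up for the example, and the Maven invocation 
in the comment is an assumption, not a confirmed cTakes build recipe.

```shell
#!/bin/sh
# Report every <target>/<something>/lib directory and how many jars it holds.
# Returns 0 if at least one such lib directory exists, 1 otherwise.
check_distribution_layout() {
    target_dir=$1
    status=1                      # stays 1 until a lib directory is found
    for lib in "$target_dir"/*/lib; do
        [ -d "$lib" ] || continue # skip the unexpanded pattern and non-dirs
        status=0
        count=0
        for jar in "$lib"/*.jar; do
            if [ -f "$jar" ]; then count=$((count + 1)); fi
        done
        printf '%s: %d jar(s)\n' "$lib" "$count"
    done
    if [ "$status" -ne 0 ]; then
        printf 'no */lib directory under %s\n' "$target_dir"
    fi
    return "$status"
}

# Hypothetical usage after a fresh build (the mvn command is an assumption):
#   mvn clean package
#   check_distribution_layout ctakes-distribution/target
```

Running the same check in Prod and in the fresh Dev build gives two short 
listings that can be compared directly.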

Yes - this xyzApp is a proprietary code module we wrote, and it depends on the 
underlying out-of-the-box cTakes modules. 

Thanks, 
Ryan 
-----Original Message-----
From: Finan, Sean <sean.fi...@childrens.harvard.edu.INVALID> 
Sent: Wednesday, July 31, 2024 11:39 AM
To: dev@ctakes.apache.org
Subject: Re: Error running Apache cTakes Pipeline [EXTERNAL]

Hi Ryan,

As far as I know, none of the [xyz] code that you are referencing comes from 
ctakes.  You probably need to find out where that code came from - hopefully 
your organization has a good relationship with the previous developer or there 
is a user in your organization who is familiar with that code and its purpose.  
It is possible that there are comments in the scripts and/or code under the xyz 
directory that can point you in the correct direction.  A web search for 
`ctakes "xyzapp"` came up empty, but that is as far as I went.

It is possible that somebody else monitoring this mailing list knows something 
about xyzapp.  You could try writing or reposting with "What is XYZAPP" and see 
if it catches an eye.

Good luck,
Sean
________________________________
From: Ryan Swenson <rswen...@nsightlabs.com>
Sent: Tuesday, July 30, 2024 2:21 PM
To: dev@ctakes.apache.org <dev@ctakes.apache.org>
Subject: RE: Error running Apache cTakes Pipeline [EXTERNAL]

It appears that one key difference between the Dev (code errors) and Prod (code 
runs) environments is that the CTAKES_HOME variable, inside a script that sets 
the Java class path, contains a -cp path to: 
/opt/xyzapp/nlp/c-takes/nlp/apache-ctakes-src/ctakes-distribution/target/apache-ctakes-4.0.0.1/lib
 which is present in Prod and not in Dev...

This subfolder contains a bin directory, which contains: ant, ant.bat, 
OpenCmd.bat, runctakesCPE.bat, and matching .sh scripts.

Do you know if there is some build step in which the Apache 4.0.0.1 binary is 
supposed to be pulled down along with the source code, or what would explain 
why this is present in Prod but not in our Dev system?
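
A JVM does not complain about class path entries that do not exist; it simply 
skips them, so a missing lib directory only shows up later as a class loading 
failure. A small check along these lines could flag such entries up front. The 
function name missing_classpath_entries is invented for this sketch.

```shell
#!/bin/sh
# Print each ':'-separated class path entry whose file or directory is missing.
# A trailing '/*' (the JVM's "all jars in this directory" wildcard) is stripped
# before testing, so the directory itself is what gets checked.
missing_classpath_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue   # ignore empty entries
        path=${entry%/\*}             # drop a literal trailing '/*'
        [ -e "$path" ] || printf '%s\n' "$entry"
    done
}

# Hypothetical usage, with CP holding whatever the launch script passes to -cp:
#   missing_classpath_entries "$CP"
```

An empty result means every entry resolves to something on disk; it does not 
prove that the right jars are inside those directories.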

Thanks,
Ryan

From: Ryan Swenson
Sent: Tuesday, July 30, 2024 1:31 PM
To: dev@ctakes.apache.org
Subject: Error running Apache cTakes Pipeline

Hello,

I have inherited an existing Apache cTakes Pipeline Application from a 
previous developer who is no longer with the parent organization I am 
supporting in the Molecular Genetics space.  The organization has built, 
runnable code that runs on a scheduled basis in production.  Separately, in a 
Development environment, we want to address defects, add new features, and 
address security exceptions and vulnerabilities, which we can later QA and roll 
into Production.

At the moment in Development we are experiencing an error in trying to run our 
Apache cTakes pipeline with the same sources and binaries taken from prod.  
Both Dev and Prod are using Maven 3.9.8 for building and JDK 8 for build and 
execution.

The specific error (some info redacted for sensitivity, some for brevity) is:

Exception in thread "main" 
org.apache.uima.resource.ResourceInitializationException: Initialization of CAS 
Processor with name "XYZ Aggregate Engine" failed.
Caused by: org.apache.uima.resource.ResourceInitializationException: 
Initalization of annotator class "com.xyz.XyzSetenceRegexAnnotator" failed. 
(Descriptor: 
file:/opt/xyzapp/scripts/resources/cTakes/desc/XyzSetenceRegexAnnotator.xml)
... 8 more
Caused by: org.springframework.beans.factory.access.BootstrapException: Unable 
to initialize group definition. Group resource name [classpath* 
org/apache/ctakes/ytex/uima/beanRefContext.xml]; nested exception is 
org.springframework.beans.factory.BeanCreationException: Error creating bean 
with name "ytexApplicationContext" defined in URL [jar: 
/opt/xyzapp/nlp/nlp-c-takes/apache-ctakes-4.00.1-src/ctakes-distribution/target/apache-ctakes-4.0.0.1-jar-with-dependencies.jar!/org/apache/ytex/uima/beanRefContext.xml]:
 Instantiation of bean failed ....  Unable to locate Spring NamespaceHandler 
for XML schema namespace [http://www.springframework.org/schema/aop] 
offending resource: class path resource 
[org/apache/ctakes/ytex/beans-kernel.xml]
...41 more ...
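
Spring resolves XML namespaces such as aop through META-INF/spring.handlers 
files on the class path. When many jars are merged into a single 
jar-with-dependencies, only one of those files may survive, whereas a lib 
directory of separate jars keeps them all; that may be what this error 
reflects. The sketch below checks whether a given spring.handlers file 
registers a namespace. The function name handlers_cover_namespace and the JAR 
variable are placeholders for this example.

```shell
#!/bin/sh
# Read the contents of a spring.handlers file on stdin and succeed (exit 0)
# if it registers a handler for the namespace URI given as $1.
# spring.handlers is a properties file, so ':' in the key appears as '\:'.
handlers_cover_namespace() {
    key=$(printf '%s' "$1" | sed 's/:/\\:/g')
    grep -F -q -- "$key="
}

# Hypothetical usage against the merged jar (JAR is a placeholder path):
#   unzip -p "$JAR" META-INF/spring.handlers \
#     | handlers_cover_namespace http://www.springframework.org/schema/aop \
#     && echo "aop handler registered" || echo "aop handler missing"
```

If the merged jar lacks the aop entry while the spring-aop jar in a lib 
directory has it, running from the separate jars is the more likely setup to 
work.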


To execute I run:

java -cp 
"/opt/xyzapp/scripts/resources/*:/opt/xyzapp/nlp/nlp-c-takes/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/*" 
 -Dorg.apache-ctakes.ytex.conceptGraphDir='pwd' org.xyz.pipeline.RunCPE 
/opt/xyzapp/scripts/resources/cTakes/CPEDescriptor.xml

I get this error if I take the apache-ctakes-4.0.0.1-jar-with-dependencies.jar 
from Prod, where the pipeline is running successfully, and place it in the 
target dir. I also get it if I build in the development environment and use 
the apache-ctakes-4.0.0.1-jar-with-dependencies.jar produced in this target 
directory.

I am also checking whether the Production shell scripts that set environment 
variables, paths, and the Java class path are the same as those in Dev (which 
I suspect they are not).
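
For the class path part of that comparison, a helper like the following could 
list the entries one environment has and the other lacks. It is a sketch only: 
the function name only_in_first and the PROD_CP and DEV_CP variables are made 
up, and it assumes the two class path strings have already been extracted from 
the scripts.

```shell
#!/bin/sh
# Print the entries of class path $1 that do not appear in class path $2.
# Entries are compared as literal strings, so 'lib/*' only matches 'lib/*'.
only_in_first() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue       # ignore empty entries
        case ":$2:" in
            *":$entry:"*) ;;              # present in both class paths
            *) printf '%s\n' "$entry" ;;  # present only in the first
        esac
    done
}

# Hypothetical usage (PROD_CP and DEV_CP are placeholders):
#   only_in_first "$PROD_CP" "$DEV_CP"   # entries Prod has that Dev lacks
#   only_in_first "$DEV_CP" "$PROD_CP"   # entries Dev has that Prod lacks
```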

If anyone can provide me with more clues or advice on this, I would greatly 
appreciate it.  The cTakes portion of this application and pipeline is one of 
four separate applications, and if we are able to build and run cTakes we will 
be able to proceed with maintaining the application portfolio.

Thanks,
Ryan


