Hi Sean,

I was able to run the pipeline yesterday afternoon. Here is how I did it, although I am still a little baffled:
1) There is a noticeable difference between the built code environments in Dev and Prod. In Prod, under the exploded Apache cTakes 4.0.0.1 src directory, the ctakes-distribution module has a target/ctakes-4.0.0.1/lib directory containing all of the Maven library jar files, and directly under the target directory are the cTakes core jar, tar.gz, and zip files that were generated. In Dev, there is no target/ctakes-4.0.0.1 sub-directory, only the generated core jar, tar.gz, and zip files.

2) I was able to run the cTakes pipeline only after I updated the class path to include /opt/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/ctakes-4.0.0.1/lib/* (in addition to the location where we store all of the desc files, dictionaries, Hibernate connection properties, etc.), and only after I copied /target/ctakes-4.0.0.1-core-with-dependencies.jar into the /opt/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/ctakes-4.0.0.1/lib/ path. Now the code runs perfectly fine. If I do not copy the core jar to this location, the primary class, which starts the CPE processing and reads the CPE descriptor file, cannot be loaded.

Next, I will likely need to create a new empty directory, expand the source there, copy in our module and the updated pom, and build, to verify what gets populated into the target sub-directory under ctakes-distribution. If the build produces only the jars, then some build step or other detail is missing, and I will need to figure it out so that we can build new code and then run it.

Yes, this xyzApp is a proprietary code module we wrote, and it depends on the underlying out-of-the-box cTakes modules.

Thanks,
Ryan

-----Original Message-----
From: Finan, Sean <sean.fi...@childrens.harvard.edu.INVALID>
Sent: Wednesday, July 31, 2024 11:39 AM
To: dev@ctakes.apache.org
Subject: Re: Error running Apache cTakes Pipeline [EXTERNAL]

Hi Ryan,

As far as I know, none of the [xyz] code that you are referencing comes from ctakes.
You probably need to find out where that code came from. Hopefully your organization has a good relationship with the previous developer, or there is a user in your organization who is familiar with that code and its purpose. It is possible that there are comments in the scripts and/or code under the xyz directory that can point you in the correct direction.

A web search for `ctakes "xyzapp"` came up empty, but that is as far as I went. It is possible that somebody else monitoring this mailing list knows something about xyzapp. You could try writing or reposting with "What is XYZAPP" and see if it catches an eye.

Good luck,
Sean

________________________________
From: Ryan Swenson <rswen...@nsightlabs.com>
Sent: Tuesday, July 30, 2024 2:21 PM
To: dev@ctakes.apache.org <dev@ctakes.apache.org>
Subject: RE: Error running Apache cTakes Pipeline [EXTERNAL]

It appears that one key difference between the Dev (code errors) and Prod (code runs) environments is that, in the CTAKES_HOME variable inside a script that sets the Java class path, there is a -cp path to:

    /opt/xyzapp/nlp/c-takes/nlp/apache-ctakes-src/ctakes-distribution/target/apache-ctakes-4.0.0.1/lib

This path is present in Prod and not in Dev. The subfolder contains a bin directory, which contains ant, ant.bat, OpenCmd.bat, runctakesCPE.bat, and matching .sh scripts.

Do you know if there is some build step in which the Apache 4.0.0.1 binary is supposed to be pulled down together with the source code, and/or what would account for this being present in Prod but not in our Dev system?

Thanks,
Ryan

From: Ryan Swenson
Sent: Tuesday, July 30, 2024 1:31 PM
To: dev@ctakes.apache.org
Subject: Error running Apache cTakes Pipeline

Hello,

I have inherited an existing Apache cTakes pipeline application from a previous developer who is no longer with the parent organization I am supporting in the Molecular Genetics space.
The organization has built, runnable code that runs on a scheduled basis in Production. Separately, in a Development environment, we want to address defects, add new features, and address security exceptions and vulnerabilities, which we can later QA and roll into Production.

At the moment, in Development, we are getting an error when trying to run our Apache cTakes pipeline with the same sources and binaries taken from Prod. Both Dev and Prod use Maven 3.9.8 for building and JDK 8 for build and execution.

The specific error (some info redacted for sensitivity, some for brevity) is:

Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of CAS Processor with name "XYZ Aggregate Engine" failed.
Caused by: org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "com.xyz.XyzSetenceRegexAnnotator" failed. (Descriptor: file:/opt/xyzapp/scripts/resources/cTakes/desc/XyzSetenceRegexAnnotator.xml)
... 8 more
Caused by: org.springframework.beans.factory.access.BootstrapException: Unable to initialize group definition. Group resource name [classpath* org/apache/ctakes/ytex/uima/beanRefContext.xml]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name "ytexApplicationContext" defined in URL [jar: /opt/xyzapp/nlp/nlp-c-takes/apache-ctakes-4.00.1-src/ctakes-distribution/target/apache-ctakes-4.0.0.1-jar-with-dependencies.jar!/org/apache/ytex/uima/beanRefContext.xml]: Instantiation of bean failed ....
Unable to locate Spring NamespaceHandler for XML schema namespace [http://www.springframework.org/schema/aop]
offending resource: class path resource [org/apache/ctakes/ytex/beans-kernel.xml]
...41 more ...

To execute, I run:

    java -cp "/opt/xyzapp/scripts/resources/*:/opt/xyzapp/nlp/nlp-c-takes/apache-ctakes-4.0.0.1-src/ctakes-distribution/target/*" -Dorg.apache-ctakes.ytex.conceptGraphDir='pwd' org.xyz.pipeline.RunCPE /opt/xyzapp/scripts/resources/cTakes/CPEDescriptor.xml

I get this error both when I take the apache-ctakes-4.0.0.1-jar-with-dependencies.jar from Prod, where the pipeline runs successfully, and place it in the target dir, and when I build in the Development environment and use the apache-ctakes-4.0.0.1-jar-with-dependencies.jar produced in that target directory.

I am also checking whether the Production shell scripts that set environment variables, paths, and the Java class path are the same as those in Development (I suspect they are not).

If anyone can provide me with more clues or advice on this, I would greatly appreciate it. The cTakes portion of this application and pipeline is one of four separate applications, and if we are able to build and run cTakes, we will be able to proceed with maintaining the application portfolio.

Thanks,
Ryan
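
One possible way to narrow down the "Unable to locate Spring NamespaceHandler" error: Spring looks up the handler for an XML namespace such as http://www.springframework.org/schema/aop in the META-INF/spring.handlers files found on the class path. If no jar on the class path lists that namespace, this error is the expected result. That may happen when a single jar-with-dependencies keeps only one library's copy of spring.handlers. The Python sketch below is a diagnostic aid only and is not part of cTakes. The function names are hypothetical, and it assumes each spring.handlers line has the usual `namespace=handlerClass` form with colons escaped as `\:`.

```python
import glob
import sys
import zipfile

# Entry that Spring reads to map XML namespaces to NamespaceHandler classes.
HANDLERS_ENTRY = "META-INF/spring.handlers"


def declared_namespaces(jar_path):
    """Return the namespaces listed in a jar's META-INF/spring.handlers.

    Returns an empty list if the file is not a readable zip/jar or has
    no spring.handlers entry. (Hypothetical helper for illustration.)
    """
    try:
        with zipfile.ZipFile(jar_path) as jar:
            if HANDLERS_ENTRY not in jar.namelist():
                return []
            text = jar.read(HANDLERS_ENTRY).decode("utf-8", errors="replace")
    except (zipfile.BadZipFile, OSError):
        return []  # directories, tar.gz files, unreadable files: skip

    namespaces = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, properties-file comments, and lines without '='.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        # Properties files escape ':' in keys as '\:'.
        namespaces.append(key.replace("\\:", ":"))
    return namespaces


def jars_declaring(namespace, patterns):
    """Return paths matching the glob patterns whose jar declares namespace."""
    hits = []
    for pattern in patterns:
        for jar_path in sorted(glob.glob(pattern)):
            if namespace in declared_namespaces(jar_path):
                hits.append(jar_path)
    return hits


if __name__ == "__main__":
    AOP = "http://www.springframework.org/schema/aop"
    # Pass quoted glob patterns, one per class path entry to inspect.
    for hit in jars_declaring(AOP, sys.argv[1:]):
        print(hit)
```

Run once with a quoted pattern for the target/* entry and once for target/ctakes-4.0.0.1/lib/*. If only the second prints any jars, that would be consistent with the pipeline starting only when the lib directory is on the class path.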