Hi Maral,

> Are you using an IDE (Integrated Development Environment) such as IntelliJ or Eclipse?
> If so then you should be able to create a run profile that can run pipelines. There is plenty of help online for that kind of thing, and people on the mailing list can probably provide examples of what they have used for ctakes.

Sean

________________________________________
From: Maral Amir <maraljav...@gmail.com>
Sent: Friday, July 19, 2019 4:52 PM
To: dev@ctakes.apache.org
Subject: Re: cTAKES Pipeline [EXTERNAL]

Hi Sean,

Thank you very much for your kind and prompt reply. I used the build profile "runPiperGui" and it worked beautifully on my custom piper file. I would appreciate it if you could direct me to the next steps on other run methods. As I mentioned earlier, my final goal is to develop an OCR+NLP web service.

Thanks,
Maral

On Fri, Jul 19, 2019 at 12:08 PM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Hi Maral,
>
> There are two slightly different directory structures: one for development (source structure), another for end use (installation structure).
>
> Since you have a copy of the source, let's start with that.
>
> Are you using an IDE (Integrated Development Environment) such as IntelliJ or Eclipse?
> If so then you should be able to create a run profile that can run pipelines. There is plenty of help online for that kind of thing, and people on the mailing list can probably provide examples of what they have used for ctakes.
>
> If you are not using an IDE, I suggest using the -PrunPiperGui maven profile that I mentioned below, just to run a test pipeline and see your output. After you have a successful run, we can move on to other run methods.
>
> When you use an IDE run profile or a maven profile you don't need to specify $CTAKES_HOME or worry about the classpath or bin/.
>
> Sean
>
> ________________________________________
> From: Maral Amir <maraljav...@gmail.com>
> Sent: Friday, July 19, 2019 2:12 PM
> To: dev@ctakes.apache.org
> Subject: Re: cTAKES Pipeline [EXTERNAL]
>
> Hi Sean,
>
> Thank you so much for your insightful response.
> I'm having a problem linking the piper files. I should mention I am using the command line interface. Could you please let me know:
>
> 1. What I should set my CTAKES_HOME variable to.
> Right now I set my CTAKES_HOME to my cTAKES user installation main folder. That is because I could see in the last line of runPiperFile.sh that the class directory $CTAKES_HOME/lib/* is included, and no /lib folder is present in the developer's version.
>
> java -cp $CTAKES_HOME/desc/:$CTAKES_HOME/resources/:$CTAKES_HOME/resources/resources:$CTAKES_HOME/lib/* -Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms512M -Xmx3g org.apache.ctakes.core.pipeline.PiperFileRunner "$@"
>
> Also,
>
> 2. Where is the *"bin"* folder where the bash file resides? Right now I use this one:
> /Users/local/projects/ctakes/trunk/ctakes-distribution/src/main/bin
>
> Thanks,
> Maral
>
> On Fri, Jul 19, 2019 at 6:13 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Maral,
> >
> > You can generate different output types by adding different writers to the end of the pipeline.
> > Here are the contents of the Default Clinical Pipeline piper file:
> >
> > ========================================================================================
> > // Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup
> >
> > // Load a simple token processing pipeline from another pipeline file
> > load DefaultTokenizerPipeline
> >
> > // Add non-core annotators
> > add ContextDependentTokenizerAnnotator
> > addDescription POSTagger
> >
> > // Add Chunkers
> > load ChunkerSubPipe
> >
> > // Default fast dictionary lookup
> > load DictionarySubPipe
> >
> > // Add Cleartk Entity Attribute annotators
> > load AttributeCleartkSubPipe
> >
> > ========================================================================================
> >
> > I recommend that you copy those lines to a new file (for instance, Maral.piper) and then add the following lines:
> >
> > ========================================================================================
> > // Write marked copy of note text in interactive html files
> > add pretty.html.HtmlTextWriter SubDirectory=HTML
> >
> > // Write Fast Healthcare Interoperability Resources (FHIR) json files. fhir.org
> > package org.apache.ctakes.fhir.cc
> > add FhirJsonFileWriter SubDirectory=FHIR
> >
> > // Write plaintext copy of note text with cui, semantic group, POS. Relations are listed.
> > add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
> >
> > // Write plaintext copy of note sentences with entity and relation discoveries listed.
> > add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
> >
> > ========================================================================================
> >
> > The output directory should then contain some new output in different subdirectories. You can change the subdirectory names.
> >
> > Note: the "=================================" are just there to indicate what is for the file. Do not copy them.
> >
> > There are many more file writers, most of which write simple lists of discoveries in one form or another.
> > I recommend trying the 4 above and seeing if any fit your purposes before moving on to more specialized writers.
> >
> > Sean
> >
> > ________________________________________
> > From: Maral Amir <maraljav...@gmail.com>
> > Sent: Thursday, July 18, 2019 7:11 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: cTAKES Pipeline [EXTERNAL]
> >
> > Hi Sean,
> >
> > Thank you so much for your very helpful and comprehensive response. I was able to generate the xmi results in the output directory and used UIMA CAS Visual Debugger (CVD) as suggested to view the information. I have two questions:
> > 1. What is the best reference for me to study and understand the annotations?
> > 2. Is there a CLI equivalent to CVD? I need the annotated outputs in a readable format without the help of CVD.
> >
> > Thanks,
> > Maral
> >
> > On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
> >
> > > Hi Maral,
> > >
> > > This might be what you are talking about with respect to the Default Clinical Pipeline:
> > >
> > > https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
> > >
> > > That lists a command line method for running a set of files and getting xml output.
> > >
> > > The default clinical pipeline configuration is actually contained in the plain text (piper) file
> > > resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
> > >
> > > If you are looking at source code then the file is
> > > ctakes-clinical-pipeline-res/src/main/resources/ ...
> > >
> > > You can also select and run a piper file with a gui:
> > >
> > > https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
> > >
> > > Both methods are mentioned near the bottom of one of the pages detailing pipeline configuration:
> > >
> > > https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> > >
> > > There are several example pipelines constructed with code and/or plain text files in the ctakes-examples and ctakes-examples-res modules. You can look at the different "Hello World" examples.
> > >
> > > Since you are playing with maven, you can run the profile "runPiperGui":
> > > mvn clean compile -DskipTests -PrunPiperGui
> > >
> > > Sean
> > >
> > > ________________________________________
> > > From: Maral Amir <maraljav...@gmail.com>
> > > Sent: Thursday, July 18, 2019 2:29 PM
> > > To: dev@ctakes.apache.org
> > > Subject: cTAKES Pipeline [EXTERNAL]
> > >
> > > Hi,
> > >
> > > I just built my developer version of cTAKES with the help of the wonderful cTAKES developers.
> > >
> > > For my next step, I would appreciate it if somebody could direct me to the right path. I am planning to process clinical text documents through the entire pipeline to generate xml output. I see the website suggests walking through the Default Clinical Pipeline. I understand there are also multiple git repositories of command-line tools based on Apache cTAKES.
> > > My final goal is to integrate cTAKES with some Python packages (OCR, etc.) into one pipeline and have some form of web service at the end. I would deeply appreciate any suggestions.
> > >
> > > Thanks,
> > > Maral
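
For anyone who wants to script the steps from this thread in Python (the language Maral mentions for the OCR side), here is a minimal sketch, not an official cTAKES tool. It assembles the piper text from the lines Sean quoted and rebuilds the java argument list from the last line of runPiperFile.sh as Maral quoted it. The function names (build_piper_text, write_piper_file, piper_runner_command) and the /path/to/ctakes placeholder are made up for illustration. The arguments for PiperFileRunner are passed through untouched, like "$@" in the script, because the thread does not list them.

```python
from pathlib import Path

# Piper lines quoted in the thread for the Default Clinical Pipeline.
DEFAULT_PIPELINE_LINES = [
    "// Load a simple token processing pipeline from another pipeline file",
    "load DefaultTokenizerPipeline",
    "// Add non-core annotators",
    "add ContextDependentTokenizerAnnotator",
    "addDescription POSTagger",
    "// Add Chunkers",
    "load ChunkerSubPipe",
    "// Default fast dictionary lookup",
    "load DictionarySubPipe",
    "// Add Cleartk Entity Attribute annotators",
    "load AttributeCleartkSubPipe",
]

# Writer lines suggested in the thread; the SubDirectory names can be changed.
WRITER_LINES = [
    "add pretty.html.HtmlTextWriter SubDirectory=HTML",
    "package org.apache.ctakes.fhir.cc",
    "add FhirJsonFileWriter SubDirectory=FHIR",
    "add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT",
    "add property.plaintext.PropertyTextWriterFit SubDirectory=PROP",
]


def build_piper_text(base=DEFAULT_PIPELINE_LINES, writers=WRITER_LINES):
    """Join the pipeline lines and the writer lines into piper file text."""
    return "\n".join([*base, "", *writers]) + "\n"


def write_piper_file(path, text):
    """Write the piper text to disk and return the path that was written."""
    target = Path(path)
    target.write_text(text, encoding="utf-8")
    return target


def piper_runner_command(ctakes_home, runner_args):
    """Build the java argument list from the last line of runPiperFile.sh.

    runner_args is passed through untouched, like "$@" in the script.
    The ':' classpath separator assumes a Unix-like system.
    """
    home = str(ctakes_home).rstrip("/")
    classpath = ":".join([
        f"{home}/desc/",
        f"{home}/resources/",
        f"{home}/resources/resources",
        f"{home}/lib/*",
    ])
    return [
        "java", "-cp", classpath,
        f"-Dlog4j.configuration=file:{home}/config/log4j.xml",
        "-Xms512M", "-Xmx3g",
        "org.apache.ctakes.core.pipeline.PiperFileRunner",
        *runner_args,
    ]


if __name__ == "__main__":
    # Print only; nothing is written or executed here.
    print(build_piper_text())
    print(piper_runner_command("/path/to/ctakes", []))
```

The list returned by piper_runner_command could be handed to subprocess.run without a shell; the lib/* entry is a classpath wildcard that java expands itself, so no shell globbing is needed. As Sean notes above, none of this is required when running through an IDE run profile or the -PrunPiperGui maven profile.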