Problem with using OrangeBook index

2013-09-03 Thread Arohi Kumar
Hello everyone, I am trying to run ClinicalPipelineWithUMLS.java (the main file in ctakes-clinical-pipeline) using Eclipse. I keep running into java.io.EOFException: read past EOF: RAMInputStream(name=segments) when opening the OrangeBook index. It occurs at the line iv_logger.info("Loadin

RE: [VOTE] Release Apache cTAKES 3.1 (rc3)

2013-09-03 Thread Masanz, James J.
I assume people who want the source release rather than trunk know they want the source release for the various policy reasons people want a source *release*. And if they don't want the binary release, and are savvy enough to use eclipse or an IDE of their own choice, then trunk seems reasonab

[ANNOUNCE] Welcome John Green as new cTAKES committer and PMC member

2013-09-03 Thread Pei Chen
The Apache cTAKES Team is happy to announce the addition of John Green as new committer. In his own words, 'I'm a third year medical student rounding the bend toward my MD. I used to be a computer programmer, however, and continue my own projects. I'm very interested in contributing to cTAKES devel

Re: [ANNOUNCE] Welcome John Green as new cTAKES committer and PMC member

2013-09-03 Thread Mattmann, Chris A (398J)
Welcome John!! Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/

Re: RTF Annotator?

2013-09-03 Thread Pei Chen
Hi David, There is work being done on Tika/OCR integration, but I am not aware of any cTAKES RTF Annotators. What do others think? Having such additional metadata does sound very interesting, especially with markup (bold/italics) and semi-structured data such as tables... --Pei On Sun, Sep 1

RE: dev guide (and all of cwiki.apache.org) seems to be down

2013-09-03 Thread Masanz, James J.
It seems up to me. I followed your link. > -----Original Message----- > From: dev-return-1939-Masanz.James=mayo@ctakes.apache.org [mailto:dev- > return-1939-Masanz.James=mayo@ctakes.apache.org] On Behalf Of Coarr, > Matt > Sent: Tuesday, September 03, 2013 10:51 AM > To: dev@ctakes.apac

Re: dev guide (and all of cwiki.apache.org) seems to be down

2013-09-03 Thread Pei Chen
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide seems to be up for me... On Tue, Sep 3, 2013 at 11:53 AM, Masanz, James J. wrote: > it seems up to me. > I followed your link. > > > > > > -----Original Message----- > > From: dev-return-1939-Masanz.James=mayo.

Re: [VOTE] Release Apache cTAKES 3.1 (rc3)

2013-09-03 Thread Pei Chen
James, +1 [x] +1 Release the packages as Apache cTAKES 3.1.0 Below is what I verified so far (plan to go through the documentation in parallel): Tested: 1- Downloaded src and bin tz 2- Verified Signatures 3- Unpacked and able to compile from source (mvn compile automatically downloads umls resour

Re: [ANNOUNCE] Welcome John Green as new cTAKES committer and PMC member

2013-09-03 Thread Karthik Sarma
Welcome! Good to have another medical student around :) -- Karthik Sarma UCLA Medical Scientist Training Program Class of 20?? Member, UCLA Medical Imaging & Informatics Lab Member, CA Delegation to the House of Delegates of the American Medical Association ksa...@ksarma.com gchat: ksa...@gmai

dev guide (and all of cwiki.apache.org) seems to be down

2013-09-03 Thread Coarr, Matt
I was pointing a coworker at the developer guide, but the cwiki server seems to be down. I didn't see any other comments about this on the mailing list. Do we need to notify someone in infrastructure? How do we do that? I also don't see cwiki mentioned anywhere on the infrastructure status pag

Re: dev guide (and all of cwiki.apache.org) seems to be down

2013-09-03 Thread Coarr, Matt
Ok, it's working again for me too. Whatever it was, it seems to have passed. Thanks, Matt From: Pei Chen <chen...@apache.org> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org> Date: Tuesday, September 3, 2013 11:55 To: "dev@ctakes.apache.org<

RE: Problem with using OrangeBook index

2013-09-03 Thread Masanz, James J.
Which version of cTAKES are you running (have you checked it out from trunk or something else)? There should be log messages stating in which directory (full path) cTAKES has found your copy of the Orange Book. Please compare the contents of that directory on your machine with https://svn.ap
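
A minimal sketch of the comparison suggested above, assuming the index directory reported in the log is available on the local file system. It lists each file with its size so the result can be checked by eye against the repository copy, and it flags zero-length files. That an empty or truncated file is behind the "read past EOF" error is only an assumption here, not something the thread confirms. The function names are hypothetical and are not part of cTAKES.

```python
from pathlib import Path


def summarize_index_dir(index_dir):
    """Return a sorted list of (file name, size in bytes) for the regular files in index_dir."""
    root = Path(index_dir)
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


def empty_files(index_dir):
    """Return the names of zero-length files, which would be worth re-downloading."""
    return [name for name, size in summarize_index_dir(index_dir) if size == 0]
```

Calling summarize_index_dir with the full path printed in the cTAKES log gives a listing that can be compared line by line with the directory in the repository.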

RE: RTF Annotator?

2013-09-03 Thread Masanz, James J.
I think text formatting is a natural for being turned into annotations. Just one example - some people use formatting to indicate section headings and there could be a sectionizer that uses rtf tags as-is to determine sections, or uses them as features at least. -- James > -Original Messag

Re: RTF Annotator?

2013-09-03 Thread Karthik Sarma
I think such a tool would be quite useful -- I imagine that David isn't the only person who works with RTF docs, and avoiding conversion should help us glean additional information as James suggests. Let me know if you need my assistance with anything! -- Karthik Sarma UCLA Medical Scientist

[DRAFT] [REPORT] Apache cTAKES Sept 2013

2013-09-03 Thread Pei Chen
[DRAFT] Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) is a natural language processing (NLP) tool for information extraction from electronic medical record clinical free-text. Issues: There are no issues requiring board attention at this time. Releases: Last release was c

RE: Information Regarding Apache cTAKES-3.0

2013-09-03 Thread Chen, Pei
[+dev] Hi Arohi, I'm glad that you have it working. I think a good place to start would be to take a look at the current type system[1], which outlines the output[2] that cTAKES currently supports. As you already found, the IdentifiedAnnotation (and its subclasses such as

specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Assur, Ted
I'm trying to understand what would prevent the AggregatePlaintextUMLSProcessor AE from correctly parsing specific problems that are defined in the UMLS version used by cTAKES. For example, CIN (Cervical Intraepithelial Neoplasia) in its general usage is parsed out as UMLS CUI C0206708. CIN co

Re: [VOTE] Release Apache cTAKES 3.1 (rc3)

2013-09-03 Thread Britt Fitch
+1 here as well. On Tue, Sep 3, 2013 at 11:47 AM, Pei Chen wrote: > James, > +1 > [x] +1 Release the packages as Apache cTAKES 3.1.0 > > Below is what I verified so far (plan to go through the documentation in > parallel): > > Tested: > 1- Downloaded src and bin tz > 2- Verified Signatures > 3-

Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Pei Chen
Hi Ted, Setting aside detecting the stage/grade and other attributes and asserting those relationships to the cancer (that's probably a separate discussion): in your example, since there are distinct SNOMEDCT concepts and direct matches, it was able to identify "Cervical intraepithelial neoplasia grad

Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Miller, Timothy
That is a good question, Ted! I tried it with a simple context: "The patient has a CIN III." I'm not sure if that is a correct context but I was able to duplicate your findings. (Finds a CUI for CIN III but not if you change it to CIN II) My first thought was that it is the chunker. But the chunk

Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Pei Chen
It has the correct parse (POS, chunks, and lookupwindow), but some of the terms do not exist in SNOMED: CIN 2 - Cervical intraepithelial neoplasia 2 [A3002688/SNOMEDCT/SY/285838002] exists but not CIN II. CIN III [A965/SNOMEDCT/SY/20365006] also exists, which is why it was able to perform the look
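
The behaviour described in this message can be illustrated with a toy exact-match lookup. This is a sketch under stated assumptions, not the cTAKES dictionary lookup code: the dictionary holds only the two synonyms whose SNOMED CT codes are quoted above, and the tokenizer and the longest-match-first scan are simplifications chosen for the illustration. All names are hypothetical.

```python
import re

# Toy dictionary (assumption): only the two synonyms and codes quoted in the message.
TOY_DICTIONARY = {
    ("cin", "2"): "285838002",
    ("cin", "iii"): "20365006",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lookup(text, dictionary=TOY_DICTIONARY, max_len=3):
    """Scan left to right and return (matched term, code) pairs, longest match first."""
    tokens = tokenize(text)
    hits = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in dictionary:
                hits.append((" ".join(span), dictionary[span]))
                i += n  # continue after the matched span
                break
        else:
            i += 1  # no entry starts at this token
    return hits
```

With this toy dictionary, "The patient has a CIN III." yields a hit, while "The patient has a CIN II." yields none, because an exact-match lookup can only return surface forms that are present as dictionary entries.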

Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Miller, Timothy
Ah. So it will get CIN 2 (in SNOMED) CIN III (in SNOMED) CIN 3 (in SNOMED) but the rest are not in SNOMED? I wonder why it doesn't get CIN I? It looks like that exists in SNOMED (though I don't fully understand what all the symbols mean in the umls browser). > CIN I - Cervical intraepithelial ne

Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-03 Thread Pei Chen
You're right, it should have gotten "CIN I"- that's a strange one, probably needs to be debugged/looked into further... On Tue, Sep 3, 2013 at 10:05 PM, Miller, Timothy wrote: > Ah. So it will get > CIN 2 (in SNOMED) > CIN III (in SNOMED) > CIN 3 (in SNOMED) > > but the rest are not in SNOMED? >