Cas to RDF

2014-07-28 Thread Nick Nikandish
Hi there,

I am able to generate an XML file from a CAS using the CasToInlineXML class. I 
was wondering whether there is any class or method in cTAKES that can generate 
RDF files from a CAS.

Thanks,
Nick
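
A minimal sketch of one possible approach, assuming no ready-made exporter is available: walk the annotations and write a few N-Triples statements for each one. Everything named here (the `Span` holder, the `urn:example:cas:` namespace, the predicate names) is a hypothetical stand-in and not a cTAKES or UIMA API; in a real pipeline the spans would come from iterating the CAS annotation index rather than from a hand-built list, and the document id and type names are assumed to be IRI-safe.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not part of cTAKES or UIMA.
class CasToRdfSketch {

    // Stand-in for one CAS annotation: offsets, a type name, and the covered text.
    static final class Span {
        final int begin;
        final int end;
        final String type;
        final String text;

        Span(int begin, int end, String type, String text) {
            this.begin = begin;
            this.end = end;
            this.type = type;
            this.text = text;
        }
    }

    // Assumed namespace, for illustration only.
    static final String NS = "urn:example:cas:";
    static final String XSD_INT = "http://www.w3.org/2001/XMLSchema#integer";

    // Escape the characters that must be escaped inside an N-Triples string literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    // Emit four statements per annotation: type, begin, end, and covered text.
    static String toNTriples(String docId, List<Span> spans) {
        StringBuilder sb = new StringBuilder();
        for (Span s : spans) {
            String subject = "<" + NS + docId + "#" + s.begin + "-" + s.end + ">";
            sb.append(subject).append(" <").append(NS).append("type> <")
              .append(NS).append(s.type).append("> .\n");
            sb.append(subject).append(" <").append(NS).append("begin> \"")
              .append(s.begin).append("\"^^<").append(XSD_INT).append("> .\n");
            sb.append(subject).append(" <").append(NS).append("end> \"")
              .append(s.end).append("\"^^<").append(XSD_INT).append("> .\n");
            sb.append(subject).append(" <").append(NS).append("coveredText> \"")
              .append(escape(s.text)).append("\" .\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(new Span(0, 7, "Mention", "Dilated"));
        System.out.print(toNTriples("doc1", spans));
    }
}
```

N-Triples is used here only because it can be written line by line without an RDF library; any RDF toolkit could take the place of the string building.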



RE: Cas to RDF

2014-07-28 Thread Masanz, James J.

Since this is more of a general UIMA question, if no one else replies to your 
email here, your best bet might be to search the UIMA archives:
http://uima.markmail.org/

and post to the UIMA user list if you don't find anything:
https://uima.apache.org/mail-lists.html



RE: Cas to RDF

2014-07-28 Thread Nick Nikandish
Thanks James.



Re: ytex examples

2014-07-28 Thread Clayton Turner
Hi again:

This issue is still persisting. I can only find the SparseDataExporterImpl
class outside of the ctakes snapshot and, even using that version, I cannot
get the exporter to work, regardless of the parameters I pass to it.


On Thu, Jul 24, 2014 at 3:18 PM, Clayton Turner 
wrote:

> Hm, okay, so I've bumped into a slightly different issue.
>
> I've got my export.xml built and put into a directory, but I still can't
> seem to find the SparseDataExporterImpl class. My guess is that it needs to
> be added to the java path via the setenv.bat file, but the only reference
> to a built version of this file that I can find is outside of the ctakes
> snapshot that I compiled with maven (so not within the ctakes home
> directory structure).
>
> Should I point there with my java path or is there some other location I'm
> just missing which contains these libraries (maybe they're in a jar that
> I'm somehow not including?)
>
>
> On Thu, Jul 24, 2014 at 2:24 PM, Clayton Turner 
> wrote:
>
>> Hi,
>>
>> Alright, I planned on using Weka, but it might not be a bad idea to just
>> jump in with either R or Python.
>>
>> I'll check out that link.
>>
>> Thanks!
>>
>>
>> On Thu, Jul 24, 2014 at 2:11 PM, vijay garla  wrote:
>>
>>> Hi Clayton,
>>>
>>> I haven't gotten around to updating the docs yet;
>>> look here for examples:
>>>
>>> https://code.google.com/p/ytex/source/browse/#svn%2Ftrunk%2Fworkspace%2Fexamples%2Ffracture
>>>
>>> If you are using R/Matlab/Python it is easy to generate a sparse matrix
>>> directly via database queries; I can give you a few examples.
>>>
>>> Best,
>>>
>>> VJ
>>>
>>>
>>> On Thu, Jul 24, 2014 at 8:02 PM, Clayton Turner 
>>> wrote:
>>>
>>> > I've been following the usage component guide for ctakes 3.2 and ytex, but
>>> > I'm having an issue.
>>> >
>>> > I get to the point where I want to export my data as a bag of words (or
>>> > cuis), but the documentation on the wiki seems to be really out of date
>>> > when it comes to the exporting for data mining step.
>>> >
>>> > The YTEX home directory doesn't seem to actually be a thing and there's no
>>> > fracture example directory with a cui/word folder for the examples anymore.
>>> > Is there an updated version of this documentation in the works or can
>>> > someone just give me pointers on how to execute the command over the
>>> > command prompt (Windows)?
>>> >
>>> > Thank you,
>>> > Clayton
>>> >
>>>



-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson
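
On the suggestion quoted above of generating a sparse matrix directly from database queries, here is a minimal sketch of the idea, assuming the query returns one (document id, CUI) row per concept mention. The rows are hard-coded where a JDBC result set would normally be read, the CUI strings are placeholders, and the class and method names are hypothetical; no YTEX table or column names are assumed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: builds a sparse bag-of-CUIs matrix from (docId, cui) rows.
class SparseBagOfCuis {

    // Column index assigned to each distinct CUI, in first-seen order.
    final Map<String, Integer> columns = new LinkedHashMap<>();

    // docId -> (column index -> count); only non-zero cells are stored.
    final Map<String, Map<Integer, Integer>> rows = new LinkedHashMap<>();

    // Record one concept mention, i.e. one row of the query result.
    void add(String docId, String cui) {
        Integer col = columns.get(cui);
        if (col == null) {
            col = columns.size();
            columns.put(cui, col);
        }
        Map<Integer, Integer> row = rows.get(docId);
        if (row == null) {
            row = new TreeMap<>();
            rows.put(docId, row);
        }
        Integer old = row.get(col);
        row.put(col, old == null ? 1 : old + 1);
    }

    // One line per document: "docId col:count col:count ...", columns ascending.
    List<String> toLines() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, Integer>> row : rows.entrySet()) {
            StringBuilder sb = new StringBuilder(row.getKey());
            for (Map.Entry<Integer, Integer> cell : row.getValue().entrySet()) {
                sb.append(' ').append(cell.getKey()).append(':').append(cell.getValue());
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Placeholder rows standing in for a database query result.
        String[][] result = {
            {"doc1", "C0000001"}, {"doc1", "C0000002"},
            {"doc1", "C0000001"}, {"doc2", "C0000002"}
        };
        SparseBagOfCuis matrix = new SparseBagOfCuis();
        for (String[] r : result) {
            matrix.add(r[0], r[1]);
        }
        for (String line : matrix.toLines()) {
            System.out.println(line);
        }
    }
}
```

The "index:count" line layout is just one convenient text form for sparse rows; the same maps could be written out in whatever format the downstream tool expects.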


Re: question about sentence segmentation

2014-07-28 Thread britt fitch
Thanks for the document, Tim. It doesn't seem to be explicit about how to handle 
sentences occurring in lists.

Are you still considering having the list number as outside of the sentence? 

Thanks

Britt

On Jul 25, 2014, at 7:09 AM, Miller, Timothy 
 wrote:

> After checking with Guergana and other colleagues here, the advice is to have the 
> sentence segmenter follow the treebank guidelines for sentence segmentation:
> http://clear.colorado.edu/compsem/documents/treebank_guidelines.pdf
> 
> They are a bit light on detail but fortunately we have some treebanked data 
> so I will use that for the training data and hopefully that will illuminate 
> the tricky cases.
> 
> Tim
> 
> 
> From: Masanz, James J. [masanz.ja...@mayo.edu]
> Sent: Tuesday, July 15, 2014 4:39 PM
> To: 'dev@ctakes.apache.org'
> Subject: RE: question about sentence segmentation
> 
> Sorry, I don't know if there was a reason.
> 
> If you haven't checked with Guergana, you might want to ask her if she had a 
> reason or if it was just the way it had been since that corpus was created.
> 
> -Original Message-
> From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
> Sent: Tuesday, July 15, 2014 3:34 PM
> To: dev@ctakes.apache.org
> Subject: Re: question about sentence segmentation
> 
> Thanks James, I was hoping to hear from you. I'll probably go ahead and
> change the data to split sentences between the list header and list element.
> 
> You don't happen to know if there is any principled reason for the
> original style or whether it was just an arbitrary convention? The only
> thing I can think of is it might be hard to learn when to separate when
> there is no period after the list header (as in your examples). I think
> it's worth empirically checking on that point, but there might be other
> reasons that I'm not thinking of.
> 
> Thanks
> Tim
> 
> On 07/15/2014 03:27 PM, Masanz, James J. wrote:
>> I don't have an opinion about how it should work.
>> 
>> But I can verify that the clinical notes from Mayo Clinic that were used in 
>> the initial cTAKES sentence detector model had the list markers included in 
>> the first sentence, so, for example, the following would be two sentences, 
>> with each line a separate sentence.
>> 
>> #1 Dilated esophagus.
>> #2 Adenocarcinoma
>> 
>> -- James
>> 
>> -Original Message-
>> From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
>> Sent: Tuesday, July 15, 2014 6:04 AM
>> To: dev@ctakes.apache.org
>> Subject: RE: question about sentence segmentation
>> 
>>> My preference is to treat the list row number as outside of the sentence of
>>> interest. Or if it is necessary to be included in a sentence, have it be a
>>> sentence on its own.
>> 
>> I can get behind this. I think it makes the issue a bit cleaner to either 
>> have the list header as non-sentential or as its own sentence. As far as I can 
>> tell, this is not the current default behavior. At least in my runs the list 
>> header seems to get attached to the first following sentence, even in cases 
>> where it starts with a digit and a period ("3. Magnesium oxide 400 mg p.o. 
>> daily." is all one sentence).
>> This behavior is probably strongly dependent on the annotations we give the 
>> sentence detector so as I'm prepping new training data I should have a 
>> default in mind.
>> 
>> Does anyone have any objections to changing the sentence detector behavior 
>> to break list headers (things like "3." or "A " or "#5") as their own 
>> sentence?
>> 
>> Tim
>> 
>> 
>> 
>> From: Britt Fitch [britt.fi...@gmail.com]
>> Sent: Monday, July 14, 2014 8:29 AM
>> To: dev@ctakes.apache.org
>> Subject: Re: question about sentence segmentation
>> 
>> My preference is to treat the list row number as outside of the sentence of
>> interest.
>> Or if it is necessary to be included in a sentence, have it be a sentence
>> on its own.
>> That won't be as straightforward as splitting on a period in cases
>> like "2. Magnesium oxide 400 mg p.o. daily."
>> In cases where there is more than one written sentence, like your example in
>> the original email, I'd prefer those were each a sentence rather than
>> making the entire list line a single sentence.
>> My feeling is that each line without terminating punctuation would be a
>> single sentence and would exclude the list number.
>> 
>> As an aside, I have encountered several issues with numbered lists being
>> interpreted differently depending on
>> 1. what number is included at the start
>> for example: "2. Magnesium oxide 400 mg p.o. daily." vs "12. Magnesium
>> oxide 400 mg p.o. daily." (This appears to be a chunking issue where the
>> line starting with "12. Magnesium" is identified as starting with chunks [O,
>> O, B-NP, B-NP, I-NP, B-NP, B-ADVP, O] even though the parts of speech
>> appear to be correct)
>> 2. whether there is a period at the end of a list
>> for example: "4. CHF" vs "4. C

Re: question about sentence segmentation

2014-07-28 Thread Miller, Timothy
Yes, you're right about that, Britt. I've been doing some annotations side by 
side with a treebank viewer and think I have a pretty good handle on the actual 
rules.

Basically, if a header or list identifier is followed by a period or a newline, 
it is considered a sentence break; otherwise it is part of the sentence.

e.g.

1. 20 mg flomax

is two sentences, while:

1 - 20 mg flomax

is one sentence.

For headings:

Allergies: Pt is allergic to aspirin.

is one sentence, while:

Allergies:
Pt is allergic to aspirin.

is two sentences.

I'm planning to follow these guidelines.

Tim
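
The rule described above can be sketched roughly as follows. This is only an illustration of the list-marker and newline rule, not the cTAKES sentence detector (which is trained from annotated data); the class name and regular expression are hypothetical, only numeric markers such as "1." or "#12." are recognized, and treating every newline as a break is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the list-marker rule; not the cTAKES sentence detector.
class ListMarkerSplitter {

    // A numeric list marker ending in a period (e.g. "1." or "#12."),
    // followed by whitespace and the rest of the line.
    private static final Pattern MARKER = Pattern.compile("^(#?\\d+\\.)\\s+(\\S.*)$");

    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        // Simplification: every newline ends a sentence.
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            Matcher m = MARKER.matcher(line);
            if (m.matches()) {
                // Marker followed by a period: the marker is its own sentence.
                sentences.add(m.group(1));
                sentences.add(m.group(2));
            } else {
                // No period after the marker or header: keep the line whole.
                sentences.add(line);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        System.out.println(split("1. 20 mg flomax"));
        System.out.println(split("1 - 20 mg flomax"));
        System.out.println(split("Allergies: Pt is allergic to aspirin."));
        System.out.println(split("Allergies:\nPt is allergic to aspirin."));
    }
}
```

The sketch does not split ordinary sentences that share a line, so it covers only the four example cases given in the message, not general sentence segmentation.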


Null Pointer

2014-07-28 Thread John Green
Any ideas why, as of cTAKES 3.2.0, I get a NullPointerException when I try to
use FilesInDirectoryCollectionReader.xml?

java.lang.NullPointerException
    at org.apache.uima.tools.cpm.CpmPanel.fileSelected(CpmPanel.java:1509)
    at org.apache.uima.tools.util.gui.FileSelector$1.actionPerformed(FileSelector.java:141)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2018)
    at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2341)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
    at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:252)
    at java.awt.Component.processMouseEvent(Component.java:6505)
    at javax.swing.JComponent.processMouseEvent(JComponent.java:3311)
    at java.awt.Component.processEvent(Component.java:6270)
    at java.awt.Container.processEvent(Container.java:2229)
    at java.awt.Component.dispatchEventImpl(Component.java:4861)
    at java.awt.Container.dispatchEventImpl(Container.java:2287)
    at java.awt.Component.dispatchEvent(Component.java:4687)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4832)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4492)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4422)
    at java.awt.Container.dispatchEventImpl(Container.java:2273)
    at java.awt.Window.dispatchEventImpl(Window.java:2719)
    at java.awt.Component.dispatchEvent(Component.java:4687)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:735)
    at java.awt.EventQueue.access$200(EventQueue.java:103)
    at java.awt.EventQueue$3.run(EventQueue.java:694)
    at java.awt.EventQueue$3.run(EventQueue.java:692)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:87)
    at java.awt.EventQueue$4.run(EventQueue.java:708)
    at java.awt.EventQueue$4.run(EventQueue.java:706)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:705)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)


Re: Null Pointer

2014-07-28 Thread John Green
Please disregard, I discovered the error.

JG

