cTAKES handling of fractions in dosage

2014-03-19 Thread Ajay Jain
Hi,
I am running the DrugAggregatePlainTextUMLSProcessor and noticing that cTAKES 
is not handling fractions for medication dosage correctly.  For example,
"Benadryl 1/2 tsp 3 times daily"
produces a MedicationMention with a dosage of 2.  I tried several 
fractions (1/2, 1/3, 2/4, etc.), and the dosage always appears to be set to the 
digit to the right of the '/'.  Is this a documented issue with cTAKES?  (I am 
using cTAKES version 3.1.)  Thanks in advance.
Jain  
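
For reference, the expected behavior can be sketched in a few lines of Java. This is only an illustration of turning a fraction token into a numeric dosage, not the cTAKES implementation; the class DosageFraction and the method parseDosage are hypothetical names that do not exist in cTAKES.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES.
class DosageFraction {
    // Matches "numerator/denominator" with whole numbers, e.g. "1/2".
    private static final Pattern FRACTION =
            Pattern.compile("(\\d+)\\s*/\\s*(\\d+)");

    /** Returns the numeric value of a dosage token such as "1/2" or "2". */
    static double parseDosage(String token) {
        String t = token.trim();
        Matcher m = FRACTION.matcher(t);
        if (m.matches()) {
            double numerator = Double.parseDouble(m.group(1));
            double denominator = Double.parseDouble(m.group(2));
            if (denominator == 0) {
                throw new IllegalArgumentException("Zero denominator: " + token);
            }
            // Use both sides of the '/', not just the right-hand digit.
            return numerator / denominator;
        }
        // Not a fraction: fall back to a plain number.
        return Double.parseDouble(t);
    }

    public static void main(String[] args) {
        System.out.println(parseDosage("1/2")); // 0.5
        System.out.println(parseDosage("2/4")); // 0.5
        System.out.println(parseDosage("2"));   // 2.0
    }
}
```

With this sketch, "Benadryl 1/2 tsp" would yield a dosage of 0.5 rather than the 2 reported above.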

RE: cTAKES handling of fractions in dosage

2014-03-20 Thread Ajay Jain
Pei,
Thanks.  I also wanted to mention that I have noticed the same behavior when 
ranges are specified in the context of dosage and frequency.  For example, when 
I try:
'Take Lipitor 1-2 tabs 3-4 times a day for 7 days', dosage is annotated as '2' 
and frequency value is annotated as '4'.
Looks like the same pattern is surfacing here (use the latter half) as well.
Best,
Jain
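
The range case can be sketched in Java as well. Assuming the desired result is to keep both ends of a range rather than only the second number, a minimal illustration might look like the following. DoseRange and its parse method are hypothetical and not part of cTAKES.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type, not part of cTAKES.
class DoseRange {
    // Matches "low-high", e.g. "1-2" or "0.5-1".
    private static final Pattern RANGE =
            Pattern.compile("(\\d+(?:\\.\\d+)?)\\s*-\\s*(\\d+(?:\\.\\d+)?)");

    final double low;
    final double high;

    DoseRange(double low, double high) {
        this.low = low;
        this.high = high;
    }

    /** Parses "1-2" into [1, 2]; a single value "7" becomes [7, 7]. */
    static DoseRange parse(String token) {
        String t = token.trim();
        Matcher m = RANGE.matcher(t);
        if (m.matches()) {
            return new DoseRange(Double.parseDouble(m.group(1)),
                                 Double.parseDouble(m.group(2)));
        }
        double value = Double.parseDouble(t);
        return new DoseRange(value, value);
    }

    @Override
    public String toString() {
        return low + "-" + high;
    }

    public static void main(String[] args) {
        System.out.println(parse("1-2")); // 1.0-2.0
        System.out.println(parse("3-4")); // 3.0-4.0
        System.out.println(parse("7"));   // 7.0-7.0
    }
}
```

For 'Take Lipitor 1-2 tabs 3-4 times a day for 7 days', this would keep 1 and 2 for the dosage and 3 and 4 for the frequency instead of only the latter half.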
> From: pei.c...@childrens.harvard.edu
> To: dev@ctakes.apache.org
> Subject: RE: cTAKES handling of fractions in dosage
> Date: Thu, 20 Mar 2014 14:41:32 +
> 
> Jain,
> A Jira issue has been opened to track this: 
> https://issues.apache.org/jira/browse/CTAKES-289
> 
> It seems like a possible bug to me, but I only had a quick glance at the code:
> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-drug-ner/src/main/java/org/apache/ctakes/drugner/fsm/machines/elements/DosagesFSM.java
> It does seem to reuse the FractionStrengthCondition, but perhaps others could 
> comment on it.
> 
> --Pei
> 
> > -----Original Message-----
> > From: Ajay Jain [mailto:goti...@hotmail.com]
> > Sent: Wednesday, March 19, 2014 3:48 PM
> > To: dev@ctakes.apache.org
> > Subject: cTAKES handling of fractions in dosage
> > 
> > Hi,
> > I am running the DrugAggregatePlainTextUMLSProcessor and noticing that
> > cTAKES is not handling fractions for medication dosage correctly.  For
> > example, "Benadryl 1/2 tsp 3 times daily"
> > produces a MedicationMention with a dosage of 2.  I tried several
> > fractions (1/2, 1/3, 2/4, etc.), and the dosage always appears to be set to
> > the digit to the right of the '/'.  Is this a documented issue with cTAKES?
> > (I am using cTAKES version 3.1.)  Thanks in advance.
> > Jain
  

De-identified lab tests dataset

2014-09-29 Thread Ajay Jain
Hello All,

I am working on a use case for lab test data using cTAKES, and my online
search for a test dataset has been futile.  I would greatly appreciate it if
someone could share such a dataset or point me in the right direction to
look for one.

Best,
Ajay

-- 
Founder & CEO
Mobile Insights, Inc.
(630) 408-8623


Re: De-identified lab tests dataset

2014-09-29 Thread Ajay Jain
Sorry, I wasn't clear. I am working on a related project and trying to figure 
out whether its code can be repurposed as a lab mention annotator for cTAKES. 
From what I have seen, test names are not standardized across institutions, 
which makes it hard to standardize the resulting annotations. Access to a 
larger structured lab test dataset would help me fine-tune the model. 
 
Hope this helps. 

Ajay
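
To illustrate the standardization problem described above, here is a minimal Java sketch that maps institution-specific test names onto one canonical name through a lookup table. LabNameNormalizer, its methods, and the sample variants are made up for illustration; a real annotator would need a curated vocabulary rather than this hand-built map.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of cTAKES.
class LabNameNormalizer {
    private final Map<String, String> canonicalByVariant = new HashMap<>();

    /** Registers one spelling of a test name under its canonical name. */
    void addVariant(String variant, String canonical) {
        canonicalByVariant.put(key(variant), canonical);
    }

    /** Returns the canonical name, or empty if the name is unknown. */
    Optional<String> normalize(String rawName) {
        return Optional.ofNullable(canonicalByVariant.get(key(rawName)));
    }

    // Lowercase and replace runs of punctuation or whitespace with a single
    // space, so that "HGB-A1C" and "Hgb A1c" share the same lookup key.
    private static String key(String name) {
        return name.toLowerCase(Locale.ROOT)
                   .replaceAll("[^a-z0-9]+", " ")
                   .trim();
    }

    public static void main(String[] args) {
        LabNameNormalizer normalizer = new LabNameNormalizer();
        // Illustrative variants only, not taken from any real dataset.
        normalizer.addVariant("Hgb A1c", "Hemoglobin A1c");
        normalizer.addVariant("HbA1c", "Hemoglobin A1c");

        System.out.println(normalizer.normalize("HGB-A1C").orElse("unknown")); // Hemoglobin A1c
        System.out.println(normalizer.normalize("Sodium").orElse("unknown"));  // unknown
    }
}
```

A larger structured dataset would mainly serve to populate and check a table like this one against the names that institutions actually use.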

> On Sep 29, 2014, at 2:12 PM, "Savova, Guergana" wrote:
> 
> Ajay,
> cTAKES currently does not implement a method to discover labs from the text. 
> The motivation is that you can get that easily from the structured part of 
> the EMR (what Pete explained below). Hope this makes sense!
> --Guergana
> 
> -----Original Message-----
> From: Peter Szolovits [mailto:p...@mit.edu] 
> Sent: Monday, September 29, 2014 2:32 PM
> To: dev@ctakes.apache.org
> Subject: Re: De-identified lab tests dataset
> 
> Ajay, I'm confused by your query.  cTAKES is good at interpreting text, but 
> most lab test results are reported in tabular form that is most appropriately 
> searched by SQL queries.  Sometimes lab results are also reported in 
> narrative notes, but parsing those is often more a matter of deciphering the 
> text structure of tables than of parsing real English text.  What am I 
> misunderstanding?
> 
> --Pete Sz.
> 
>> On Sep 29, 2014, at 2:25 PM, Ajay Jain wrote:
>> 
>> Hello All,
>> 
>> I am working on a use case for lab test data using cTAKES, and my 
>> online search for a test dataset has been futile.  I would greatly 
>> appreciate it if someone could share such a dataset or point me in the 
>> right direction to look for one.
>> 
>> Best,
>> Ajay
>> 
>> --
>> Founder & CEO
>> Mobile Insights, Inc.
>> (630) 408-8623
> 


Re: De-identified lab tests dataset

2014-09-30 Thread Ajay Jain
John,

I am in the initial stages of my project and I'll take whatever dataset you are 
able to provide without spending a lot of effort extracting it. 

Thanks.
Ajay


> On Sep 30, 2014, at 5:22 AM, "John Green" wrote:
> 
> How large? And across how many EMRs? 
> 
> 
> JG
> —
> Sent from Mailbox
> 
> On Mon, Sep 29, 2014 at 6:58 PM, Ajay Jain wrote:
> 
>> Sorry, I wasn't clear. I am working on a related project and trying to 
>> figure out whether its code can be repurposed as a lab mention annotator 
>> for cTAKES. From what I have seen, test names are not standardized across 
>> institutions, which makes it hard to standardize the resulting 
>> annotations. Access to a larger structured lab test dataset would help me 
>> fine-tune the model. 
>> 
>> Hope this helps. 
>> Ajay
>> Sent from my iPhone