I could pull a dozen or so "sets" of labs from my own personal bank of notes, covering various forms of what you would usually call the lab section of a SOAP note, with minimal effort. I don't mind; it might take me a couple of days with work tempo as it is. It's probably all from two different EMRs in total, though, with a handful of written values in shorthand (e.g. the classic fishbones used for things like BNP and CBC), so not a lot of variability, but maybe enough to start compiling regexes with.
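
As a rough illustration of the regex approach mentioned above, a minimal Python sketch might look like the following. The test-name list, the `extract_labs` helper, and the sample strings are all hypothetical and not taken from cTAKES or from any real notes; output from different EMRs, and fishbone shorthand in particular, would need patterns of their own.

```python
import re

# Hypothetical, illustrative list of lab abbreviations; not a standard vocabulary.
LAB_NAMES = ["Na", "K", "Cl", "CO2", "BUN", "Cr", "Glu", "WBC", "Hgb", "Hct", "Plt"]

# Matches "<name> <value>", optionally separated by ":" or "=",
# e.g. "Na 140", "WBC: 7.2", "Hgb=13.5". Units are deliberately ignored.
LAB_PATTERN = re.compile(
    r"\b(?P<name>" + "|".join(re.escape(n) for n in LAB_NAMES) + r")\b"
    r"\s*[:=]?\s*"
    r"(?P<value>\d+(?:\.\d+)?)"
)


def extract_labs(text):
    """Return (name, value) pairs found in a free-text lab line."""
    return [
        (m.group("name"), float(m.group("value")))
        for m in LAB_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    # Made-up sample line, for demonstration only.
    print(extract_labs("Na 140, K 4.1, Cl 102, CO2 24"))
```

A pattern like this only covers inline "name value" pairs; positional shorthand, where the meaning of a number depends on where it sits in the diagram, would need a different strategy.
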
If that's helpful and no one else comes along with some free data of a larger sort... Also, there are about 10 notes I committed to the project a year or so ago as examples that may have lab data in them.

JG
--
Sent from Mailbox

On Tue, Sep 30, 2014 at 8:25 AM, Ajay Jain <ajayj...@mobileinsights.net> wrote:

> John,
> I am in the initial stages of my project and I'll take whatever dataset you
> are able to provide without spending a lot of effort extracting it.
> Thanks.
> Ajay
> Sent from my iPhone
>> On Sep 30, 2014, at 5:22 AM, "John Green" <john.travis.gr...@gmail.com>
>> wrote:
>>
>> How large? And across how many EMRs?
>>
>>
>> JG
>> --
>> Sent from Mailbox
>>
>> On Mon, Sep 29, 2014 at 6:58 PM, Ajay Jain <ajayj...@mobileinsights.net>
>> wrote:
>>
>>> Sorry, I wasn't clear. I am working on a related project and trying to
>>> figure out if the code can be repurposed for a lab mention annotator for
>>> cTAKES. From what I have seen, test names from different institutions are
>>> not standardized which makes it hard to standardize the resulting
>>> annotation. Getting access to a larger lab tests dataset (structured) will
>>> help me fine tune the model.
>>>
>>> Hope this helps.
>>> Ajay
>>> Sent from my iPhone
>>>> On Sep 29, 2014, at 2:12 PM, "Savova, Guergana"
>>>> <guergana.sav...@childrens.harvard.edu> wrote:
>>>>
>>>> Ajay,
>>>> cTAKES currently does not implement a method to discover labs from the
>>>> text. The motivation is that you can get that easily from the structured
>>>> part of the EMR (what Pete explained below). Hope this makes sense!
>>>> --Guergana
>>>>
>>>> -----Original Message-----
>>>> From: Peter Szolovits [mailto:p...@mit.edu]
>>>> Sent: Monday, September 29, 2014 2:32 PM
>>>> To: dev@ctakes.apache.org
>>>> Subject: Re: De-identified lab tests dataset
>>>>
>>>> Ajay, I'm confused by your query. cTakes is good at interpreting text,
>>>> but most lab test results are reported in tabular form that is most
>>>> appropriately searched by SQL queries. Sometimes lab results are also
>>>> reported in narrative notes, but parsing those is often more a matter of
>>>> deciphering the text structure of tables than of parsing real English
>>>> text. What am I misunderstanding?
>>>>
>>>> --Pete Sz.
>>>>
>>>>> On Sep 29, 2014, at 2:25 PM, Ajay Jain <ajayj...@mobileinsights.net>
>>>>> wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I am working on a use case for lab tests data using cTAKES and my
>>>>> online search to find a test dataset has been futile. I'll greatly
>>>>> appreciate if someone can share such a dataset or can point me in the
>>>>> right direction to go looking for one.
>>>>>
>>>>> Best,
>>>>> Ajay
>>>>>
>>>>> --
>>>>> Founder & CEO
>>>>> Mobile Insights, Inc.
>>>>> (630) 408-8623