Re: De-identified lab tests dataset
How large? And across how many EMRs?

JG

On Mon, Sep 29, 2014 at 6:58 PM, Ajay Jain wrote:
> Sorry, I wasn't clear. I am working on a related project and trying to figure
> out if the code can be repurposed for a lab mention annotator for cTAKES.
> From what I have seen, test names from different institutions are not
> standardized which makes it hard to standardize the resulting annotation.
> Getting access to a larger lab tests dataset (structured) will help me fine
> tune the model.
>
> Hope this helps.
> Ajay
>
>> On Sep 29, 2014, at 2:12 PM, "Savova, Guergana" wrote:
>>
>> Ajay,
>> cTAKES currently does not implement a method to discover labs from the text.
>> The motivation is that you can get that easily from the structured part of
>> the EMR (what Pete explained below). Hope this makes sense!
>> --Guergana
>>
>> -Original Message-
>> From: Peter Szolovits [mailto:p...@mit.edu]
>> Sent: Monday, September 29, 2014 2:32 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: De-identified lab tests dataset
>>
>> Ajay, I'm confused by your query. cTakes is good at interpreting text, but
>> most lab test results are reported in tabular form that is most
>> appropriately searched by SQL queries. Sometimes lab results are also
>> reported in narrative notes, but parsing those is often more a matter of
>> deciphering the text structure of tables than of parsing real English text.
>> What am I misunderstanding?
>>
>> --Pete Sz.
>>
>>> On Sep 29, 2014, at 2:25 PM, Ajay Jain wrote:
>>>
>>> Hello All,
>>>
>>> I am working on a use case for lab tests data using cTAKES and my
>>> online search to find a test dataset has been futile. I'll greatly
>>> appreciate if someone can share such a dataset or can point me in the
>>> right direction to go looking for one.
>>>
>>> Best,
>>> Ajay
>>>
>>> --
>>> Founder & CEO
>>> Mobile Insights, Inc.
>>> (630) 408-8623
Re: De-identified lab tests dataset
John,
I am in the initial stages of my project and I'll take whatever dataset you
are able to provide without spending a lot of effort extracting it.
Thanks.
Ajay

> On Sep 30, 2014, at 5:22 AM, "John Green" wrote:
>
> How large? And across how many EMRs?
>
> JG
Re: De-identified lab tests dataset
I could pull a dozen or so "sets" of labs from my own personal bank of notes
that contain various forms of what you would usually call the lab section of
a SOAP note with minimal effort. I don't mind, though it might take me a
couple of days with work tempo as it is. It's probably all from two different
EMRs total, with a handful of written values in shorthand (e.g., the classic
fishbones used for things like BNP and CBC), so not a lot of variability, but
maybe enough to start compiling regexes with. If that's helpful and no one
else comes along with some free data of a larger sort...

Also, there are about 10 notes I committed to the project a year or so ago as
examples that may have lab data in them.

JG

On Tue, Sep 30, 2014 at 8:25 AM, Ajay Jain wrote:
> John,
> I am in the initial stages of my project and I'll take whatever dataset you
> are able to provide without spending a lot of effort extracting it.
> Thanks.
> Ajay