Brandon,
That sounds great!
Please open a Jira ticket for any contributions (anyone should be able
to create a Jira account).  There are some legal items built into the
ASF Jira attachments for accepting contributions/donations.
It will also credit the contributors with the merit appropriately.
Anyone who is interested can follow the Jira item. (Even better if
contributions were open discussion/open development.)
--Pei

On Tue, Dec 8, 2015 at 10:36 PM, Geise, Brandon D.
<bdge...@geisinger.edu> wrote:
> I'd be interested in contributing to making the dictionary tool more user 
> friendly with a GUI.
>
> Thanks,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Tuesday, December 08, 2015 6:12 PM
> To: dev@ctakes.apache.org
> Subject: RE: ctakes with icd10; 2015 versions available on sourceforge!
>
> Hi Dave,
>
> I'm always happy to see interest in our stuff!
>
>>Step 1
> I built the tool to be able to build a dictionary using anything in the umls 
> - snomed, icd9, hpo, etc. so using the veterinary extension shouldn't be a 
> problem.  You just add it to the CtakesSources file (or create an alternate 
> file and point to it with -src).  To answer another of your questions, there 
> can be zero or more sources - you saw snomedct and snomedct_us (each valid in 
> a different umls version).
> It also can include any semantic type, just add (or remove) the appropriate 
> tuis in a different data file.
>
>>Step 2
> You have it right - you copy the templates to another location and output to 
> that location.  Otherwise you 'lose' your templates.
>
>>Step 3 and 4
> The jar is built from source.  I need to (soon) check in updates to the 
> source, and at the same time I can check in a default prebuilt .jar.  The 
> lib/ directory is in the source repository.
>
> Various people have toyed with the idea of putting the tool into a ctakes 
> module, putting it into an "installation package", making a gui ...  The best 
> option (imo) is probably to make an easy to use gui and keep a pre-built 
> version in sandbox.  Someday, after the rainbow, maybe I'll get a chance to 
> do that ...
>
> Sean
>
>
> -----Original Message-----
> From: David Kincaid [mailto:kincaid.d...@gmail.com]
> Sent: Tuesday, December 08, 2015 4:57 PM
> To: dev@ctakes.apache.org
> Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
>
> Thanks, Sean! It's great that cTAKES may soon have an up-to-date database out 
> of the box. Hopefully it will cut down on the need for many to build their 
> own DBs. Thank you very much for doing that.
>
> Unfortunately, I still will need to build a custom one for us. I work in 
> veterinary medicine so I need to add in the veterinary extension for 
> SNOMED-CT into the database.
>
> I looked over the steps below that Brandon included and have some questions:
>
> step 1 says to 'Change /data/default/CtakesSources.txt from "SNOMEDCT" to 
> "SNOMEDCT_US"'. The file that I have has two lines in it. First line is 
> SNOMED, second line is SNOMEDCT_US. So this step doesn't really make sense.
>
> step 2 should reference the two scripts as being in resource/memdbtemplate so 
> others don't have to search for them. Not sure what it means to move them to 
> "location to put new UMLS DB". Does that mean move them into a new directory 
> where the newly created UMLS DB will get written?
>
> steps 3 and 4 for running the tools reference dictionarytool.jar which 
> doesn't exist. Does one need to build that somehow from the source before 
> running it? The command line also adds "lib/*" to the classpath. Is that the 
> lib directory inside the dictionarytool source code or some other location?
>
> What else would I need to do to include the SNOMED-CT Veterinary Extension 
> along with the snomedct and rxnorm sources?
>
> I'll probably not have time to try this out for a while yet, but when I do 
> I'd be happy to write up an easy to follow tutorial for building a custom 
> dictionary assuming I am able to get it to work.
>
> Has anyone considered making this tool available outside of the source code 
> itself? Like including it in the main cTAKES release? It seems there is 
> demand for it.
>
> - Dave
>
> On Tue, Dec 8, 2015 at 3:22 PM, Finan, Sean < 
> sean.fi...@childrens.harvard.edu> wrote:
>
>> Hi Brandon, thanks for finding and forwarding the instructions!
>>
>> I have checked in two new hsqldb dictionaries, both from the 2015AB
>> version of the UMLS.  They both have codes for snomedct_us, rxnorm,
>> icd9cm and icd10pcs - as well as the usual cui, tui, preferred term mappings.
>>
>> One uses cuis filtered by snomed and rxnorm, the other adds cuis
>> filtered by icd9 and icd10.
>> What this means:  Cuis that exist for a [filter source] are added to
>> the dictionary, as are all text variations from all sources that
>> contain that cui.  Both dictionaries also use the standard ctakes
>> semantic group tui filters.
>>
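The filter rule described above can be pictured with a minimal Java sketch. This is not the dictionary tool's code: the class CuiFilterSketch, the Term type, and the sample cuis and source names are all made up for illustration. It assumes each UMLS term row carries a cui, a source vocabulary, and a text string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the filter rule; not the dictionary tool's actual code.
final class CuiFilterSketch {

    // One term row: concept id (cui), source vocabulary, and term text.
    static final class Term {
        final String cui;
        final String source;
        final String text;

        Term(String cui, String source, String text) {
            this.cui = cui;
            this.source = source;
            this.text = text;
        }
    }

    // Keep every term whose cui appears at least once in a filter source,
    // regardless of which source the term itself comes from.
    static List<Term> filter(List<Term> all, Set<String> filterSources) {
        // Pass 1: collect the cuis that exist in any filter source.
        Set<String> wantedCuis = new HashSet<>();
        for (Term t : all) {
            if (filterSources.contains(t.source)) {
                wantedCuis.add(t.cui);
            }
        }
        // Pass 2: keep all text variations for those cuis, from all sources.
        List<Term> kept = new ArrayList<>();
        for (Term t : all) {
            if (wantedCuis.contains(t.cui)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up rows; real cuis and source names come from the UMLS files.
        List<Term> all = Arrays.asList(
            new Term("C0000001", "SNOMEDCT_US", "heart attack"),
            new Term("C0000001", "MSH", "myocardial infarction"),
            new Term("C0000002", "MSH", "term with no filter source"));
        Set<String> filters = new HashSet<>(Arrays.asList("SNOMEDCT_US", "RXNORM"));
        for (Term t : filter(all, filters)) {
            System.out.println(t.cui + " | " + t.source + " | " + t.text);
        }
    }
}
```

In this sample both C0000001 rows are kept, including the one from the non-filter source, while C0000002 is dropped because no filter source contains it.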
>> The names are ctakessnorx2015 and ctakesicd2015
>>
>> The snomed rxnorm :
>>
>> http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/ctakes-resources-snomed-rword-hsqldb-2011ab/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/ctakessnorx2015/
>>
>> The snomed rxnorm icd9 icd10:
>>
>> http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/ctakes-resources-snomed-rword-hsqldb-2011ab/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/ctakesicd2015/
>>
>> The svn root for the whole ugly thing is:
>>  svn checkout svn://svn.code.sf.net/p/ctakesresources/code/trunk
>>
>> Stats:
>> ctakessnorx2015
>> 545,913 Terms
>> 229,251 Concepts (Cuis)
>> 272,987 Snomed codes
>> 32,419 Rxnorm codes
>> 11,321 icd9 codes
>> 61 icd10 codes
>>
>> ctakesicd2015
>> 611,230 Terms
>> 282,211 Concepts
>> 18,626 icd9 codes
>> 45,818 icd10 codes
>> Snomed and Rxnorm counts are the same
>>
>> So, adding the icd filters gave us an extra ~53,000 concepts and
>> ~65,000 terms.
>>
>> I would like to move this all to a better root (not
>> ctakes-resources-snomed-rword-hsqldb-2011ab) but I wasn't able to
>> write directly in trunk (??) and need to get moving on to other things.
>>
>> There is help on the ctakes wiki:
>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+-+Fast+Dictionary+Lookup
>> Though I should probably add a few items ...
>>
>>
>> Sean
>>
>>
>> -----Original Message-----
>> From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
>> Sent: Tuesday, December 08, 2015 12:51 PM
>> To: dev@ctakes.apache.org
>> Subject: RE: ctakes with icd10
>>
>> Not to perpetuate the instructions again but I sent these out not long
>> ago when I was going through the process and Sean was helping me.
>>
>>   1. Change /data/default/CtakesSources.txt from "SNOMEDCT" to "SNOMEDCT_US"
>>   2. Copy ctakesumls.properties and ctakesumls.script from memdbtemplate
>>      to location to put new UMLS DB
>>   3. Run DictionaryCreator2
>>      java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
>>   4. Run CodeMapCreator
>>      java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.CodeMapCreator -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
>>   5. Copy new DB files to new location and create a copy of
>>      cTakesHsql.xml and update dictionary location
>>
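The commands in steps 3 and 4 use ";" as the classpath separator, which is the Windows convention; on Linux and macOS the separator is ":". As a rough sketch, the same argument lists could be assembled in Java with File.pathSeparator so they work on either platform. The helper class DictionaryCommandSketch and the placeholder paths are assumptions for illustration; only the class names and the -umls, -atui, -db and -tbl flags come from the steps above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; only the tool class names and flags come from the steps above.
final class DictionaryCommandSketch {

    // Build the argument list for one of the two tool classes
    // (DictionaryCreator2 or CodeMapCreator).
    static List<String> buildCommand(String mainClass, String umlsMeta, String dbPath) {
        // File.pathSeparator is ";" on Windows and ":" on Linux and macOS.
        String classpath = "dictionarytool.jar" + File.pathSeparator + "lib/*";
        return new ArrayList<>(Arrays.asList(
            "java", "-cp", classpath,
            "org.apache.ctakes.dictionarytool." + mainClass,
            "-umls", umlsMeta,
            "-atui", "./data/tiny/CtakesAnatTuis.txt",
            "-db", "jdbc:hsqldb:file:" + dbPath,
            "-tbl", "CUI_TERMS"));
    }

    public static void main(String[] args) {
        // Placeholder paths; substitute the real UMLS META and new DB locations.
        String umlsMeta = "/path/to/umls/META";
        String dbPath = "/path/to/newdb/snorx2015";
        System.out.println(String.join(" ", buildCommand("DictionaryCreator2", umlsMeta, dbPath)));
        System.out.println(String.join(" ", buildCommand("CodeMapCreator", umlsMeta, dbPath)));
    }
}
```

The returned list could also be handed to java.lang.ProcessBuilder instead of being printed, with the working directory set to wherever dictionarytool.jar and lib/ live.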
>> Thanks,
>> Brandon
>>
>> -----Original Message-----
>> From: David Kincaid [mailto:kincaid.d...@gmail.com]
>> Sent: Tuesday, December 08, 2015 12:47 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: ctakes with icd10
>>
>> This seems like a pretty common request, and with such an old version
>> of the UMLS database shipped with cTAKES it's only going to get worse.
>> I've been wanting to build a dictionary using the latest UMLS release
>> (as well as a custom database), so would be happy to write up the
>> steps as I go through it. That assumes that I can dig up the
>> instructions in the dev list.
>>
>> - Dave
>>
>> On Tue, Dec 8, 2015 at 11:36 AM, Finan, Sean <
>> sean.fi...@childrens.harvard.edu> wrote:
>>
>> > Hi Alaa,
>> >
>> > The -shortest- answer is that you'll need to run the dictionary
>> > creation tool.  There are instructions in older devlist threads.  By
>> > default the dictionary creation tool does add icd9 and icd10 tables
>> > to the dictionary.
>> > The problem is that in UMLS 2011AB those codes weren't very well
>> > populated.  The 2015AB icd# set is much richer, so those tables
>> > should be pretty good.  Then in ctakes you would look up annotations
>> > by icd9 or icd10 codes instead of by cui:
>> >   OntologyConceptUtil.getAnnotationsByCode( jcas, lookupWindow, icd#Code );
>> >   OntologyConceptUtil.getAnnotationsByCode( jcas, icd#Code );
>> >
>> > Sean
>> >
>> > -----Original Message-----
>> > From: Savova, Guergana
>> > [mailto:guergana.sav...@childrens.harvard.edu]
>> > Sent: Tuesday, December 08, 2015 12:17 PM
>> > To: dev@ctakes.apache.org
>> > Subject: RE: ctakes with icd10
>> >
>> > Hi Alaa,
>> > You need to create a resource off the terminology/ontology you want
>> > to use (in this case ICD9 or ICD10). Then run that resource with
>> > cTAKES for the fast dictionary lookup. There is cTAKES code and some
>> > documentation on how to create that resource. By default, cTAKES
>> > runs with a resource created from the English version of SNOMED CT and 
>> > RxNORM.
>> > Hope this helps.
>> > --Guergana
>> >
>> > -----Original Message-----
>> > From: Alaa al Barari [mailto:alaa.albar...@gmail.com]
>> > Sent: Tuesday, December 8, 2015 10:01 AM
>> > To: dev@ctakes.apache.org
>> > Subject: ctakes with icd10
>> >
>> > Hi,
>> >
>> > I downloaded the latest UMLS version, and I want to know how to make
>> > ctakes work with icd10 and icd9.
>> >
>> >
>> > Thanks
>> >
>>
