How are you creating your pipeline? Do you run the command-line script, run with an XML descriptor file in the CPE GUI, or have you written a Java class that creates the pipeline? The DictionaryDescriptor parameter points cTAKES to your custom configuration file. Without it, cTAKES is probably defaulting to the 2011AB dictionary.
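For the descriptor-file case, here is a minimal sketch of where that parameter would sit. It assumes your cTAKES version exposes DictionaryDescriptor as an ordinary string configuration parameter on the dictionary lookup annotator; check the annotator's own descriptor, since it may be declared differently in your release. The element names are standard UIMA analysis engine descriptor syntax. The file path is a hypothetical placeholder for wherever your generated lookup specification lives. Note that this belongs in the pipeline's annotator descriptor, not inside the lookup specification file itself.

```xml
<!-- Fragment of the analysis engine descriptor for the dictionary lookup annotator. -->
<analysisEngineMetaData>
  <configurationParameters>
    <!-- Assumption: the parameter is declared as a single optional String. -->
    <configurationParameter>
      <name>DictionaryDescriptor</name>
      <type>String</type>
      <multiValued>false</multiValued>
      <mandatory>false</mandatory>
    </configurationParameter>
  </configurationParameters>
  <configurationParameterSettings>
    <nameValuePair>
      <name>DictionaryDescriptor</name>
      <value>
        <!-- Hypothetical path: replace with the location of your own lookup specification. -->
        <string>org/apache/ctakes/dictionary/lookup/fast/custom2.xml</string>
      </value>
    </nameValuePair>
  </configurationParameterSettings>
</analysisEngineMetaData>
```

If the setting is absent, the annotator falls back to whatever default it ships with, which would be consistent with concepts still coming from the bundled dictionary.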
-----Original Message-----
From: Martijn [mailto:mgkersl...@uvic.ca]
Sent: Tuesday, March 14, 2017 1:15 PM
To: dev@ctakes.apache.org
Subject: Re: 2016AB UMLS (ctakessnorx)

That parameter isn’t in my xml file, this is:

<?xml version="1.0" encoding="UTF-8"?>
<!--
    Licensed to the Apache Software Foundation (ASF) under one or more
    contributor license agreements. See the NOTICE file distributed with
    this work for additional information regarding copyright ownership.
    The ASF licenses this file to you under the Apache License, Version 2.0
    (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->

<!-- New format for the .xml lookup specification. Uses table name and value type/class for Concept Factories. -->
<lookupSpecification>
  <dictionaries>
    <dictionary>
      <name>custom2Terms</name>
      <implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary</implementationName>
      <properties>
        <!-- urls for hsqldb memory connections must be file types in hsql 1.8.
             These file urls must be either absolute path or relative to current working directory.
             They cannot be based upon the classpath.
             Though JdbcConnectionFactory will attempt to "find" a db based upon the parent dir of the url
             for the sake of ide ease-of-use, the user should be aware of these hsql limitations.
        -->
        <property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
        <property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/custom2/custom2"/>
        <property key="jdbcUser" value="sa"/>
        <property key="jdbcPass" value=""/>
        <property key="rareWordTable" value="cui_terms"/>
        <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
        <property key="umlsVendor" value="NLM-6515182895"/>
        <property key="umlsUser" value="XXXX"/>
        <property key="umlsPass" value="XXXX"/>
      </properties>
    </dictionary>
  </dictionaries>

  <conceptFactories>
    <conceptFactory>
      <name>custom2Concepts</name>
      <implementationName>org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory</implementationName>
      <properties>
        <property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
        <property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/custom2/custom2"/>
        <property key="jdbcUser" value="sa"/>
        <property key="jdbcPass" value=""/>
        <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
        <property key="umlsVendor" value="NLM-6515182895"/>
        <property key="umlsUser" value="XXXX"/>
        <property key="umlsPass" value="XXXX"/>
        <property key="tuiTable" value="tui"/>
        <property key="prefTermTable" value="prefTerm"/>
        <!-- Optional tables for optional term info.
             Uncommenting these lines alone may not persist term information;
             persistence depends upon the TermConsumer.
        -->
        <property key="snomedct_usTable" value="long"/>
      </properties>
    </conceptFactory>
  </conceptFactories>

  <!-- Defines what terms and concepts will be used -->
  <dictionaryConceptPairs>
    <dictionaryConceptPair>
      <name>custom2Pair</name>
      <dictionaryName>custom2Terms</dictionaryName>
      <conceptFactoryName>custom2Concepts</conceptFactoryName>
    </dictionaryConceptPair>
  </dictionaryConceptPairs>

  <!-- DefaultTermConsumer will persist all spans.
       PrecisionTermConsumer will persist only the longest overlapping span of any semantic group.
       SemanticCleanupTermConsumer works as Precision** but also removes signs/symptoms contained within disease/disorder,
       and (just in case) removes any s/s and d/d that are also (exactly) anatomical sites. -->
  <rareWordConsumer>
    <name>Term Consumer</name>
    <implementationName>org.apache.ctakes.dictionary.lookup2.consumer.DefaultTermConsumer</implementationName>
    <!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.PrecisionTermConsumer</implementationName>-->
    <!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.SemanticCleanupTermConsumer</implementationName>-->
    <properties>
      <!-- Depending upon the consumer, the value of codingScheme may or may not be used. With the packaged consumers,
           codingScheme is a default value used only for cuis that do not have secondary codes (snomed, rxnorm, etc.) -->
      <property key="codingScheme" value="custom2"/>
    </properties>
  </rareWordConsumer>

</lookupSpecification>

On 14/03/2017, 09:58, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:

You are pointing your "DictionaryDescriptor" parameter to your custom .xml configuration file?

-----Original Message-----
From: Martijn [mailto:mgkersl...@uvic.ca]
Sent: Tuesday, March 14, 2017 12:50 PM
To: dev@ctakes.apache.org
Subject: Re: 2016AB UMLS (ctakessnorx)

Apparently, my database browser created a new database file instead of opening the dictionary file (sorry for that!). When I open the correct file, it shows the SNOMEDCT_US table.
Unfortunately, that doesn’t explain why cTAKES won’t return SNOMED concepts…

On 14/03/2017, 09:44, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:

That is very strange. How large is the database .script file? It is unlikely, but I wonder if the db library is running out of memory but not reporting the problem.

-----Original Message-----
From: Martijn [mailto:mgkersl...@uvic.ca]
Sent: Tuesday, March 14, 2017 12:41 PM
To: dev@ctakes.apache.org
Subject: Re: 2016AB UMLS (ctakessnorx)

According to the output the database should be filled, but when I browse it, it’s empty.

INFO RareWordDbWriter:168 - Main Table Rows 341512
INFO RareWordDbWriter:169 - Tui Table Rows 242342
INFO RareWordDbWriter:170 - Preferred Term Table Rows 220791
INFO RareWordDbWriter:184 - SNOMEDCT_US Table Rows 230545
INFO MainPanel:182 - Dictionary custom2 successfully built in

On 14/03/2017, 09:31, "Kean Kaufmann" <k...@recordsone.com> wrote:

FWIW, I also ran the GUI in the last few weeks and got all the secondary tables for the sources I selected, including SNOMEDCT_US.

On Tue, Mar 14, 2017 at 12:18 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Hi Martijn,
> That is very strange. I don't know why the database would have an empty
> table. It is acting like snomed codes were not found for any of your cuis
> in your local umls installation, but that is terribly unlikely.
>
> I ran the gui about two weeks ago and the secondary database tables were
> populated.
>
> Sorry that I can't help, this is unfortunate,
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:mgkersl...@uvic.ca]
> Sent: Tuesday, March 14, 2017 12:07 PM
> To: dev@ctakes.apache.org
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> It’s an array, but I only sent you a single item out of that array.
>
> The .xml file lists the snomedct_us table, but the database is empty. I’m
> sure that I selected SNOMEDCT_US in the dictionarygui tool. Do you have any
> idea why the database could be empty?
>
> Thanks.
>
> - Martijn
>
> On 10/03/2017, 10:33, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:
>
> Hi Martijn,
>
> The UmlsConcept should be in an array in the IdentifiedAnnotation.
> Does your array only contain a single concept?
>
> When you use the gui, it should store codes in the database as a
> unique table for every (target) vocabulary that you selected in the left
> panel. The .xml file that it creates should list all of those types in the
> <conceptFactory> section. In your .xml you should see the line:
> <property key="snomedct_usTable" value="long"/>
>
> If that line is in the .xml, you can inspect your database directly
> with a hsqldb tool. I can help you do that if needed. The db should have
> a table named "snomedct_us".
>
> If all of those are ok then I will need to look at the lookup code as
> something must have broken.
>
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:mgkersl...@uvic.ca]
> Sent: Thursday, March 09, 2017 5:35 PM
> To: dev@ctakes.apache.org
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> Thanks! I used the GUI to generate a custom dictionary, but I still
> get the UMLS code and not the SNOMED CT one.
>
> If I print the concept that was detected it returns:
> UmlsConcept
>    codingScheme: "custom"
>    code: <null>
>    oid: "null#custom"
>    oui: <null>
>    score: 0.0
>    disambiguated: false
>    cui: "C1281583"
>    tui: "T023"
>    preferredText: "Entire hand"
>
> As you can see, the code is <null>. The concept is present in the UMLS
> subset and there is also a SNOMED CT code listed there:
>
> Entire hand [A3421866/SNOMEDCT_US/PT] CUI:C1281583 SCUI:302539009
>
> Am I doing something wrong?
>
> - Martijn
>
> On 08/03/2017, 08:12, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:
>
> Hi Martijn,
>
> The dictionary creator gui is in sandbox just like the command
> line tool, but it is newer and easier to use.
>
> OntologyConceptUtil is in org.apache.ctakes.core.util.
>
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:mgkersl...@uvic.ca]
> Sent: Tuesday, March 07, 2017 5:38 PM
> To: dev@ctakes.apache.org
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> Thanks so much for your quick reply.
> I used the command line directorytool. Is that different than the
> gui? Can that explain the decrease in tagged concepts?
>
> I’m not able to find the OntologyConceptUtil class, may I ask what
> the path for that class is?
>
> - Martijn
>
> On 07/03/2017, 14:24, "Finan, Sean" <Sean.Finan@childrens.harvard.edu> wrote:
>
> Hi Martijn,
>
> Since you say that you've created your own dictionary I will
> assume that you used the gui in sandbox to do so. If that isn't the case
> then let me know.
>
> Any dictionary created using the default settings on the
> gui does have snomedct and rxnorm codes in addition to the cuis. However,
> umls cui is always used as the primary normalization code for ctakes
> annotations.
>
> To obtain codes for an annotation, check the
> OntologyConceptUtil in ctakes core. It has methods that will return all
> associated codes as well as one to get all associated codes for a
> scheme/vocabulary (like snomedct_us, etc.). It can do this for a single
> annotation, a collection of annotations, the entire document, or a section
> of the document (sentence, paragraph, section). It also has methods that
> allow you to fetch annotations found in the document by codes other than
> the umls cui.
>
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:mgkersl...@uvic.ca]
> Sent: Tuesday, March 07, 2017 5:13 PM
> To: dev@ctakes.apache.org
> Subject: 2016AB UMLS (ctakessnorx)
>
> Hi,
>
> I've been using cTAKES for a bit now, but I still can't figure
> out how to upgrade the UMLS version to the most recent one.
> If I create my own dictionary, cTAKES only returns UMLS
> concepts and no SNOMED CT ones (I'm interested in those).
The amount of concepts returned is also way less compared to the 2011 UMLS that's
> included with cTAKES.
>
> Can someone help me out by providing me a proper 2016
> dictionary or clear explanation how to implement the newest version of the
> UMLS (with SNOMED CT).
>
> Thanks!
>
> - Martijn