Sorry, my mistake, it was still running the old dictionary lookups.
Since your earlier question, I have been trying to get the lookup-fast to
work and have not yet been successful.
I made the change to AggregatePlaintextUMLSProcessor.xml:
<!--
<delegateAnalysisEngine key="DictionaryLookupAnnotatorDB">
<import
location="../../../ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml"/>
</delegateAnalysisEngine>
-->
<delegateAnalysisEngine key="DictionaryLookupAnnotatorDB">
<import
location="../../../ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml"/>
</delegateAnalysisEngine>
But I've been getting the following exception and trying to figure out why:
Caused by: org.apache.uima.resource.ResourceInitializationException: Could not access the resource data at file:org/apache/ctakes/dictionary/lookup/fast/cTakesHsql.xml.
    at org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:127)
    at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
    ... 31 more
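
One way to narrow this down is to check whether the descriptor is visible to the class loader at all. This is only a minimal sketch: it assumes the file is meant to be resolved from the classpath (UIMA may also search its data path), and the class name ResourceCheck is hypothetical.

```java
import java.net.URL;

// Hypothetical helper, not part of cTAKES or UIMA.
class ResourceCheck {

    // Returns the URL the class loader resolves for a classpath-relative path,
    // or null if no such resource is visible on the classpath.
    static URL locate(String path) {
        return ResourceCheck.class.getClassLoader().getResource(path);
    }

    public static void main(String[] args) {
        // Path taken from the exception message above.
        String path = "org/apache/ctakes/dictionary/lookup/fast/cTakesHsql.xml";
        URL url = locate(path);
        System.out.println(url == null
                ? "Not visible on the classpath: " + path
                : "Found at: " + url);
    }
}
```

It would need to run with the same classpath as the pipeline for the answer to mean anything.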
[image: IMAT Solutions] <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
[image: Mobile:] 801.634.1547
[email protected]
On Thu, Oct 9, 2014 at 11:42 AM, Finan, Sean <
[email protected]> wrote:
> I just ran the -fast lookup with an example containing bacitracin in four
> sentences, once being the first word and once being the last. In ten of
> ten runs all four bacitracin mentions were discovered.
>
> Did you completely replace the dictionary lookup with the following?
> <delegateAnalysisEngine key="DictionaryLookupAnnotatorDB">
> <import
> location="../../../ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml"/>
> </delegateAnalysisEngine>
>
>
> From: Bruce Tietjen [mailto:[email protected]]
> Sent: Thursday, October 09, 2014 11:42 AM
> To: [email protected]
> Subject: Re: Differences in MedicationMention annotations on subsequent
> processing runs
>
> I tried the dictionary-lookup-fast module and the behavior is the same. I
> did have to run it a number of times before the timing was right to reproduce
> the issue. With the older lookup, the chances were about 50/50 as to which
> dictionary ran first. With dictionary-lookup-fast, it seems more like 70/30,
> with the standard umls lookup being more likely to run first.
> This means that most of the time there is no MedicationMention annotation
> for Bacitracin. (See attached.)
> The code with the issue is the DictionaryLookupAnnotator, which is a
> container for the dictionaries. It iterates through the list of lookup
> dictionaries, so that part of the code path does not seem to have changed.
> In the past, the rxNorm dictionary was a Lucene search, so I'm guessing
> it behaved a little differently than it does now that both are JDBC.
> The fact that the filter is at this location seems to indicate that it may
> have been intended to apply across all dictionaries. On the other
> hand, it appears to mask out the lookups for the different dictionaries,
> resulting in some annotations not being made.
>
> So, the real question is how should the filter work -- should the
> annotation filtering be per lookup dictionary, or be across all
> dictionaries? Or is there something wrong elsewhere that causes
> I lean towards having the filter operate per dictionary. This may risk
> having duplicate annotations, but that would probably be better than
> missing the annotation altogether.
>
>
>
>
>
>
> On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean <[email protected]> wrote:
> Hi Bruce,
>
> With Pei's help I just updated the SourceForge repo with the cTakes
> dictionaries. Check out the artifact ctakes-resources-snomed-rword-hsqldb-2011ab.
>
> Sean
>
> -----Original Message-----
> From: Bruce Tietjen [mailto:[email protected]]
> Sent: Wednesday, October 08, 2014 11:52 AM
> To: [email protected]
> Subject: Re: Differences in MedicationMention annotations on subsequent
> processing runs
>
> If I understand correctly, I would need new dictionary resources to run the
> rare word lookup method.
>
> Where can I find the necessary dictionary(ies) or how do I build them?
>
>
>
> On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean <[email protected]> wrote:
>
> > Hi Bruce,
> >
> > I would venture to say that this is neither expected nor desired.
> >
> >
> >
> > Before you fix it (or in addition to a fix), try running with the new
> > dictionary lookup. It will behave differently, and it will be the
> > default dictionary lookup in future releases of cTakes, making fixes to
> > the current module slightly less urgent.
> >
> >
> >
> > Sean
> >
> >
> >
> > From: Bruce Tietjen [mailto:[email protected]]
> > Sent: Wednesday, October 08, 2014 11:38 AM
> > To: [email protected]
> > Subject: Differences in MedicationMention annotations on subsequent
> > processing runs
> >
> >
> >
> >
> >
> > I have encountered a situation in which the cTakes clinical pipeline
> > output differs between multiple runs on the same text with the same
> > configuration.
> >
> > The following snippets from a single document are sufficient to
> > demonstrate the issue:
> >
> > a gentle curve going into. irrigated with Bacitracin.
> >
> >
> >
> > The source of the difference is that the DictionaryLookupAnnotator uses a
> > map to filter out duplicate annotations for a single document location:
> >
> > // used to prevent duplicate hits
> > // key = hit begin,end key (java.lang.String)
> > // val = Set of MetaDataHit objects
> > private Map<String,Set<MetaDataHit>> iv_dupMap = new HashMap<>();
> >
> > This map is shared between the umls_ms_2011ab lookup and the
> > umls_ms_2011an_rxnorm lookup.
> >
> >
> >
> > If both dictionaries contain the same term, the order of dictionary lookup
> > execution determines the output. If the rxnorm lookup runs first, then a
> > MedicationMention annotation for Bacitracin appears in the final output. If
> > the standard umls lookup runs first, then there is no MedicationMention
> > annotation for Bacitracin.
> >
> > I will attach the output from the subsequent runs. (Hopefully the
> > attachment will make it through the system)
> >
> >
> >
> > Is this expected behavior? If not, what would be the expected behavior?
> >
> >
> >
> >
>
>
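
The order dependence described in the original message above, and the per-dictionary alternative, can be sketched in isolation. This is a minimal sketch, not the cTAKES implementation: the class DupFilterDemo, the Hit type, the concept id, and the offsets are all hypothetical, and it assumes that two hits count as duplicates when they cover the same span and carry the same concept id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins; these are not the actual cTAKES classes.
class DupFilterDemo {

    // A dictionary hit: the dictionary that produced it, a concept id, and its span.
    static final class Hit {
        final String dictionary;
        final String conceptId;
        final int begin;
        final int end;

        Hit(String dictionary, String conceptId, int begin, int end) {
            this.dictionary = dictionary;
            this.conceptId = conceptId;
            this.begin = begin;
            this.end = end;
        }
    }

    // Keeps a hit only if the same concept id was not already seen at its span.
    // Assumption: this is what "duplicate" means; the real comparison may differ.
    static void keepNew(List<Hit> hits, Map<String, Set<String>> dupMap, List<String> kept) {
        for (Hit hit : hits) {
            String spanKey = hit.begin + "," + hit.end;
            if (dupMap.computeIfAbsent(spanKey, k -> new HashSet<>()).add(hit.conceptId)) {
                kept.add(hit.dictionary);
            }
        }
    }

    // One map shared by every dictionary: the first dictionary to run wins.
    static List<String> sharedFilter(List<List<Hit>> hitsPerDictionary) {
        Map<String, Set<String>> dupMap = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (List<Hit> hits : hitsPerDictionary) {
            keepNew(hits, dupMap, kept);
        }
        return kept;
    }

    // A fresh map per dictionary: each dictionary keeps its own hit, in any order.
    static List<String> perDictionaryFilter(List<List<Hit>> hitsPerDictionary) {
        List<String> kept = new ArrayList<>();
        for (List<Hit> hits : hitsPerDictionary) {
            keepNew(hits, new HashMap<>(), kept);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Both dictionaries match "Bacitracin" at the same illustrative offsets.
        List<Hit> umls = Arrays.asList(new Hit("umls", "CONCEPT_1", 15, 25));
        List<Hit> rxnorm = Arrays.asList(new Hit("rxnorm", "CONCEPT_1", 15, 25));

        System.out.println(sharedFilter(Arrays.asList(umls, rxnorm)));        // [umls]
        System.out.println(sharedFilter(Arrays.asList(rxnorm, umls)));        // [rxnorm]
        System.out.println(perDictionaryFilter(Arrays.asList(umls, rxnorm))); // [umls, rxnorm]
    }
}
```

With a shared map the surviving hit depends on which dictionary runs first. With a map per dictionary both hits survive in either order, at the cost of possible duplicate annotations downstream.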