Rather than spam the mailing list with the full list of filenames in the set we used, I would be happy to send it privately to anyone interested.
IMAT Solutions <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547
[email protected]

On Fri, Dec 19, 2014 at 11:47 AM, Kim Ebert <[email protected]> wrote:

> Pei,
>
> I don't think bugs/issues should be part of determining whether one
> algorithm is superior to the other. Obviously, it is worth mentioning the
> bugs, but if the fast lookup method has worse precision and recall but
> better performance than the slower but more accurate first-word lookup
> algorithm, then time should be invested in fixing those bugs and
> resolving those weird issues.
>
> Now I'm not saying which one is superior in this case, as the data will
> end up speaking for itself one way or the other; but as of right now, I'm
> not convinced yet that the old dictionary lookup is obsolete, and I'm not
> sure the community is convinced yet either.
>
> IMAT Solutions <http://imatsolutions.com>
> Kim Ebert
> Software Engineer
> Office: 801.669.7342
> [email protected]
>
> On 12/19/2014 08:39 AM, Chen, Pei wrote:
>
> Also check out the stats that Sean ran before releasing the new
> component:
> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx
>
> From the evaluation and experience, the new lookup algorithm should be a
> huge improvement in terms of both speed and accuracy. This is very
> different from what Bruce mentioned… I'm sure Sean will chime in here.
> (The old dictionary lookup is essentially obsolete now- plagued with
> bugs/issues as you mentioned.)
>
> --Pei
>
> *From:* Kim Ebert [mailto:[email protected]]
> *Sent:* Friday, December 19, 2014 10:25 AM
> *To:* [email protected]
> *Subject:* Re: cTakes Annotation Comparison
>
> Guergana,
>
> I'm curious about the number of records in your gold standard sets, and
> whether your gold standard set was run through a long-running cTAKES
> process.
> I know at some point we fixed a bug in the old dictionary lookup that
> caused the permutations to become corrupted over time. Typically this
> isn't seen in the first few records, but over time, as patterns are used,
> the permutations would become corrupted. This caused documents that were
> fed through cTAKES more than once to have fewer codes returned than the
> first time.
>
> For example, if a permutation of 4,2,3,1 was found, the permutation would
> be corrupted to 1,2,3,4. It would no longer be possible to detect
> permutations of 4,2,3,1 until cTAKES was restarted. We got the fix in
> after the cTAKES 3.2.0 release.
> https://issues.apache.org/jira/browse/CTAKES-310
> Depending upon the corpus size, I could see the permutation engine
> eventually having only a single permutation of 1,2,3,4.
>
> Typically, though, this isn't easily detected in the first 100 or so
> documents.
>
> We discovered this issue when we made cTAKES produce consistent output
> of codes in our system.
>
> IMAT Solutions <http://imatsolutions.com>
> Kim Ebert
> Software Engineer
> Office: 801.669.7342
> [email protected]
>
> On 12/19/2014 07:05 AM, Savova, Guergana wrote:
>
> We are doing a similar kind of evaluation and will report the results.
>
> Before we released the Fast lookup, we did a systematic evaluation
> across three gold standard sets. We did not see the trend that Bruce
> reported below. The P, R and F1 results from the old dictionary lookup
> and the fast one were similar.
>
> Thank you everyone!
> --Guergana
>
> -----Original Message-----
> From: David Kincaid [mailto:[email protected]]
> Sent: Friday, December 19, 2014 9:02 AM
> To: [email protected]
> Subject: Re: cTakes Annotation Comparison
>
> Thanks for this, Bruce! Very interesting work. It confirms what I've
> seen in the small, non-systematic tests that I've done.
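The corruption Kim describes above (a stored permutation of 4,2,3,1 silently becoming 1,2,3,4 until restart) is the classic signature of an in-place sort applied to shared state. A minimal sketch of that failure mode, using hypothetical names rather than the actual cTAKES code from CTAKES-310:

```python
# Sketch of the aliasing bug described above (hypothetical names,
# not the actual cTAKES implementation).

class PermutationEngine:
    def __init__(self):
        # Shared table of word-order permutations, built once at startup.
        self.permutations = [[4, 2, 3, 1], [2, 1], [3, 1, 2]]

    def lookup_buggy(self):
        """Buggy: sorts each shared permutation list in place, so the
        pattern [4, 2, 3, 1] is silently replaced by [1, 2, 3, 4]."""
        hits = []
        for perm in self.permutations:
            perm.sort()  # mutates the shared table!
            hits.append(tuple(perm))
        return hits

    def lookup_fixed(self):
        """Fixed: sorts a copy, leaving the shared table intact."""
        return [tuple(sorted(perm)) for perm in self.permutations]

engine = PermutationEngine()
engine.lookup_buggy()
# After one buggy lookup, [4, 2, 3, 1] is gone from the table for good:
assert [4, 2, 3, 1] not in engine.permutations
assert [1, 2, 3, 4] in engine.permutations
```

This matches the symptom Kim reports: the first pass over a document still works, but any later document needing the 4,2,3,1 pattern comes back with fewer codes, and the effect compounds as more patterns are exercised.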
> Did you happen to capture the number of false positives yet (annotations
> made by cTAKES that are not in the human-adjudicated standard)? I've
> seen a lot of dictionary hits that are not actually entity mentions, but
> I haven't had a chance to do a systematic analysis (we're working on our
> annotated gold standard now). One great example is the antibiotic
> "Today". Every time the word today appears in any text, it is annotated
> as a medication mention, when it almost never is being used in that
> sense.
>
> These results by themselves are quite disappointing to me. Both the
> UMLSProcessor and especially the FastUMLSProcessor seem to have pretty
> poor recall. It seems like the trade-off for more speed is a ten-fold
> (or more) decrease in entity recognition.
>
> Thanks again for sharing your results with us. I think they are very
> useful to the project.
>
> - Dave
>
> On Thu, Dec 18, 2014 at 5:06 PM, Bruce Tietjen
> <[email protected]> wrote:
>
> Actually, we are working on a similar tool to compare it to the
> human-adjudicated standard for the set we tested against. I didn't
> mention it before because the tool isn't complete yet, but initial
> results for the set (excluding those marked as "CUI-less") were as
> follows:
>
> Human adjudicated annotations: 4591 (excluding CUI-less)
>
> Annotations found matching the human adjudicated standard:
>     UMLSProcessor       2245
>     FastUMLSProcessor    215
>
> IMAT Solutions <http://imatsolutions.com>
> Bruce Tietjen
> Senior Software Engineer
> Mobile: 801.634.1547
> [email protected]
>
> On Thu, Dec 18, 2014 at 3:37 PM, Chen, Pei
> <[email protected]> wrote:
>
> Bruce,
> Thanks for this-- very useful.
> Perhaps Sean Finan can comment more-
> but it's also probably worth it to compare to an adjudicated
> human-annotated gold standard.
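From the counts Bruce posted, approximate recall against the human-adjudicated standard can be computed directly; precision and F1 would additionally need the false-positive counts Dave asks about. A quick back-of-the-envelope check:

```python
# Recall against the human-adjudicated standard, using the counts posted
# above (precision/F1 would also need the false-positive counts).
gold = 4591  # adjudicated annotations, excluding CUI-less

for name, matches in [("UMLSProcessor", 2245), ("FastUMLSProcessor", 215)]:
    recall = matches / gold
    print(f"{name}: recall = {recall:.3f}")
# UMLSProcessor: recall = 0.489
# FastUMLSProcessor: recall = 0.047
```

The roughly 0.489 vs 0.047 ratio is the "ten-fold (or more) decrease" Dave refers to, though note these numbers predate the CTAKES-310 fix discussion above and exclude CUI-less annotations.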
> --Pei
>
> -----Original Message-----
> From: Bruce Tietjen [mailto:[email protected]]
> Sent: Thursday, December 18, 2014 1:45 PM
> To: [email protected]
> Subject: cTakes Annotation Comparison
>
> With the recent release of cTAKES 3.2.1, we were very interested in
> checking for any differences in annotations between the
> AggregatePlaintextUMLSProcessor pipeline and the
> AggregatePlaintextFastUMLSProcessor pipeline within this release of
> cTAKES and its associated set of UMLS resources.
>
> We chose to use the SHARE 14-a-b Training data, which consists of 199
> documents (Discharge 61, ECG 54, Echo 42, and Radiology 42), as the
> basis for the comparison.
>
> We decided to share a summary of the results with the development
> community.
>
> Documents processed: 199
>
> Processing time:
>     UMLSProcessor        2,439 seconds
>     FastUMLSProcessor    1,837 seconds
>
> Total annotations reported:
>     UMLSProcessor       20,365 annotations
>     FastUMLSProcessor    8,284 annotations
>
> Annotation comparisons:
>     Annotations common to both sets:                     3,940
>     Annotations reported only by the UMLSProcessor:     16,425
>     Annotations reported only by the FastUMLSProcessor:  4,344
>
> If anyone is interested, the following was our test procedure:
>
> We used the UIMA CPE to process the document set twice, once using the
> AggregatePlaintextUMLSProcessor pipeline and once using the
> AggregatePlaintextFastUMLSProcessor pipeline. We used the
> WriteCAStoFile CAS consumer to write the results to output files.
>
> We used a tool we recently developed to analyze and compare the
> annotations generated by the two pipelines. The tool compares the two
> outputs for each file and reports any differences in the annotations
> (MedicationMention, SignSymptomMention, ProcedureMention,
> AnatomicalSiteMention, and DiseaseDisorderMention) between the two
> output sets.
> The tool reports the number of 'matches' and 'misses' between each
> annotation set. A 'match' is defined as the presence of an identified
> source text interval with its associated CUI appearing in both
> annotation sets. A 'miss' is defined as the presence of an identified
> source text interval and its associated CUI in one annotation set, but
> no matching identified source text interval and CUI in the other. The
> tool also reports the total number of annotations (source text intervals
> with associated CUIs) reported in each annotation set. The compare tool
> is in our GitHub repository at
> https://github.com/perfectsearch/cTAKES-compare
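The match/miss definition Bruce gives reduces to set operations over (source text interval, CUI) pairs. A minimal sketch of that comparison logic, with a hypothetical data layout (the actual tool is the cTAKES-compare repository linked above):

```python
# Sketch of the 'match'/'miss' definition described above: a match is a
# (source-text interval, CUI) pair present in both annotation sets.
# Hypothetical tuple layout, not the cTAKES-compare implementation.

def compare(annotations_a, annotations_b):
    """Each input is a collection of (begin, end, cui) tuples."""
    a, b = set(annotations_a), set(annotations_b)
    return {
        "matches": len(a & b),   # same interval and CUI in both sets
        "only_a": len(a - b),    # misses: present only in set A
        "only_b": len(b - a),    # misses: present only in set B
    }

umls = {(0, 5, "C0011849"), (10, 15, "C0020538")}
fast = {(0, 5, "C0011849")}
print(compare(umls, fast))
# {'matches': 1, 'only_a': 1, 'only_b': 0}
```

Note that under this definition an annotation with the right span but a different CUI counts as a miss in both directions, which is one plausible contributor to the large "only by" counts in the summary above.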
