RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison : Span Overlap addendum

2015-01-09 Thread Finan, Sean
d match". This is pretty lenient, but seems to work in my tests. "this kinda-sorta should ..." will not match ... though maybe '-' should be a special case. Let me know what you think. Enjoy, Sean -----Original Message- From: Masanz, James J. [mailto:masanz.ja...@

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison

2015-01-09 Thread Finan, Sean
3:57 PM To: 'dev@ctakes.apache.org' Subject: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison Sean (or others), Of the various configuration options described below, which values/choices would you recommend for best F1 measure for something like
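For reference, the F1 measure named in the subject line is the harmonic mean of precision and recall. Below is a minimal sketch of that calculation. The function name and the counts in the usage note are hypothetical and are not taken from the thread or from cTAKES.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    Returns 0.0 when precision or recall is undefined or zero.
    """
    predicted = true_positives + false_positives  # annotations the system produced
    actual = true_positives + false_negatives     # annotations in the gold standard
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)
```

With made-up counts, `f1_score(8, 2, 2)` gives precision and recall of 0.8 each, so F1 is also 0.8.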

dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison

2015-01-09 Thread Masanz, James J.
4 10:43 AM To: dev@ctakes.apache.org; kim.eb...@imatsolutions.com Subject: RE: cTakes Annotation Comparison Also check out stats that Sean ran before releasing the new component on: http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx From the evaluat

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
spot checks on the validity”. In other words, >>> when your script reports that a cui and/or span is missing, manually look >>> at the data and see if it really is. Just open up one .xmi in the CVD and >>> see what it looks like. >>> >>> >>> >>> Th

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
>> Sorry, I meant “Do some spot checks on the validity”. In other words, >>> when your script reports that a cui and/or span is missing, manually look >>> at the data and see if it really is. Just open up one .xmi in the CVD and >>> see what it looks like.

RE: cTakes Annotation Comparison --- (^:

2014-12-19 Thread Finan, Sean
-Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 5:05 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison My apologies to Sean and everyone, I am happy to report that I found a bug in our analysis tools that

Re: cTakes Annotation Comparison

2014-12-19 Thread Pei Chen
ing, manually > look > >> at the data and see if it really is. Just open up one .xmi in the CVD > and > >> see what it looks like. > >> > >> > >> > >> Thanks, > >> > >> Sean > >> > >> > >> >

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
>> see what it looks like. >> >> >> >> Thanks, >> >> Sean >> >> >> >> *From:* Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] >> *Sent:* Friday, December 19, 2014 3:37 PM >> *To:* dev@ctakes.apache.org >> *Subject:*

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
at it looks like. > > > > Thanks, > > Sean > > > > *From:* Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] > *Sent:* Friday, December 19, 2014 3:37 PM > *To:* dev@ctakes.apache.org > *Subject:* Re: cTakes Annotation Comparison > > > > M

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
[mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:37 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison My original results were using a newly downloaded cTakes 3.2.1 with the separately downloaded resources copied in. There were no changes to any of the

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
ld be so horribly inaccurate. Thanks -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:29 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Correction -- So far, I did steps 1 and 2 of Sean's ema

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
leaner code base. If this is not the case or if there are issues, we > > should fix it and move forward.). > > > > We can keep the old component around for as long as needed, but it’s > > likely going to have limited support… > > > > --Pei > > > > >

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
that you'd only have two matches per document (100 docs?). Thanks, Sean -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:23 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean, I

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
should fix it and move >> forward.). >> >> We can keep the old component around for as long as needed, but it’s >> likely going to have limited support… >> >> --Pei >> >> >> >> *From:* Kim Ebert [mailto:kim.eb...@imatsolutions.com] >>

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
...@imatsolutions.com] > *Sent:* Friday, December 19, 2014 1:47 PM > *To:* Chen, Pei; dev@ctakes.apache.org > > *Subject:* Re: cTakes Annotation Comparison > > > > Pei, > > I don't think bugs/issues should be part of determining if one algorithm > vs the oth

RE: cTakes Annotation Comparison

2014-12-19 Thread Chen, Pei
@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Pei, I don't think bugs/issues should be part of determining if one algorithm vs the other is superior. Obviously, it is worth mentioning the bugs, but if the fast lookup method has worse precision and recall but better performanc

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
lookup (that is to say: when working with the default lookup). From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 1:40 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean, I don't think that would be an issue since both the rare

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
(The old dictionary lookup is essentially obsolete now- plagued with > bugs/issues as you mentioned.) > > --Pei > > > > *From:* Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com > ] > *Sent:* Friday, December 19, 2014 10:25 AM > *To:* dev@ctakes.apache.org > *Subject:* Re: c

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
kim.eb...@perfectsearchcorp.com] > *Sent:* Friday, December 19, 2014 10:25 AM > *To:* dev@ctakes.apache.org > *Subject:* Re: cTakes Annotation Comparison > > > > Guergana, > > I'm curious to the number of records that are in your gold standard > sets,

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
I’m bringing it up in case the Human Annotations were done using a different version. From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 1:40 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean, I don't think that would be an

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
and moved from one TUI to another. > > Sean > > -Original Message- > From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu] > Sent: Friday, December 19, 2014 1:28 PM > To: dev@ctakes.apache.org > Subject: RE: cTakes Annotation Comparison > > Several thoughts:

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
onths cuis are added, removed, deprecated, and moved from one TUI to another. Sean -Original Message- From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu] Sent: Friday, December 19, 2014 1:28 PM To: dev@ctakes.apache.org Subject: RE: cTakes Annotation Comparison Se

RE: cTakes Annotation Comparison

2014-12-19 Thread Savova, Guergana
@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Our analysis against the human adjudicated gold standard from this SHARE corpus is using a simple check to see if the cTakes output included the annotation specified by the gold standard. The initial results I reported were for exact matches
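The kind of check described in this message (is each gold-standard annotation present in the system output, by exact span or by overlapping span?) could be sketched as follows. This is an assumption-laden illustration, not the poster's actual analysis tool: the `Annotation` type, the function names, and the rule that the CUI must also agree are all hypothetical.

```python
from typing import Iterable, NamedTuple


class Annotation(NamedTuple):
    """Hypothetical annotation record: character offsets plus a UMLS CUI."""
    begin: int  # inclusive start offset
    end: int    # exclusive end offset
    cui: str


def spans_overlap(a: Annotation, b: Annotation) -> bool:
    """True when the two half-open spans share at least one character."""
    return a.begin < b.end and b.begin < a.end


def is_found(gold: Annotation, system: Iterable[Annotation],
             allow_overlap: bool = False) -> bool:
    """Check whether the system output contains the gold annotation.

    Assumption: the CUI must match; the span must be identical unless
    allow_overlap is set, in which case any overlap counts.
    """
    for candidate in system:
        if candidate.cui != gold.cui:
            continue
        if candidate.begin == gold.begin and candidate.end == gold.end:
            return True
        if allow_overlap and spans_overlap(gold, candidate):
            return True
    return False


def recall(gold_annotations: Iterable[Annotation],
           system_annotations: Iterable[Annotation],
           allow_overlap: bool = False) -> float:
    """Fraction of gold annotations found in the system output."""
    gold = list(gold_annotations)
    system = list(system_annotations)
    if not gold:
        return 0.0
    found = sum(1 for g in gold if is_found(g, system, allow_overlap))
    return found / len(gold)
```

Running the same comparison with `allow_overlap=False` and then `allow_overlap=True` shows how much of an apparent miss rate comes from span boundary differences rather than from missing concepts.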

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
fast one were > similar. > > Thank you everyone! > --Guergana > > -Original Message- > From: David Kincaid [mailto:kincaid.d...@gmail.com] > Sent: Friday, December 19, 2014 9:02 AM > To: dev@ctakes.apache.org > Subject: Re: cTakes Annota

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
look up and the fast > one were similar. > > Thank you everyone! > --Guergana > > -Original Message- > From: David Kincaid [mailto:kincaid.d...@gmail.com] > Sent: Friday, December 19, 2014 9:02 AM > To: dev@ctakes.apache.org > Subject

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
To: dev@ctakes.apache.org; kim.eb...@imatsolutions.com Subject: RE: cTakes Annotation Comparison Also check out stats that Sean ran before releasing the new component on: http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx From the evaluation and experie

Re: cTakes Annotation Comparison

2014-12-19 Thread Miller, Timothy
F1 results from the old dictionary look up and the fast one were similar. Thank you everyone! --Guergana -Original Message- From: David Kincaid [mailto:kincaid.d...@gmail.com] Sent: Friday, December 19, 2014 9:02 AM To: dev@ctakes.apache.org Subject: R

RE: cTakes Annotation Comparison

2014-12-19 Thread Chen, Pei
To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Guergana, I'm curious to the number of records that are in your gold standard sets, or if your gold standard set was run through a long running cTAKES process. I know at some point we fixed a bug in the old dictionary lookup

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
: David Kincaid [mailto:kincaid.d...@gmail.com] > Sent: Friday, December 19, 2014 9:02 AM > To: dev@ctakes.apache.org > Subject: Re: cTakes Annotation Comparison > > Thanks for this, Bruce! Very interesting work. It confirms what I've seen in > my small tests that I've d

Re: cTakes Annotation Comparison

2014-12-19 Thread David Kincaid
> -Original Message- > From: David Kincaid [mailto:kincaid.d...@gmail.com] > Sent: Friday, December 19, 2014 9:02 AM > To: dev@ctakes.apache.org > Subject: Re: cTakes Annotation Comparison > > Thanks for this, Bruce! Very interesting work. It confirms what I've seen > in my s

RE: cTakes Annotation Comparison

2014-12-19 Thread Savova, Guergana
were similar. Thank you everyone! --Guergana -Original Message- From: David Kincaid [mailto:kincaid.d...@gmail.com] Sent: Friday, December 19, 2014 9:02 AM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Thanks for this, Bruce! Very interesting work. It confirms what

Re: cTakes Annotation Comparison

2014-12-19 Thread David Kincaid
to compare to an adjudicated human > > annotated gold standard. > > > > --Pei > > > > -----Original Message- > > From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] > > Sent: Thursday, December 18, 2014 1:45 PM > > To: dev@ctakes.ap

Re: cTakes Annotation Comparison

2014-12-19 Thread John Green
> From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] >> Sent: Thursday, December 18, 2014 1:45 PM >> To: dev@ctakes.apache.org >> Subject: cTakes Annotation Comparison >> >> With the recent release of cTakes 3.2.1, we were very interested in >> checki

Re: cTakes Annotation Comparison

2014-12-18 Thread Bruce Tietjen
. > > --Pei > > -Original Message- > From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] > Sent: Thursday, December 18, 2014 1:45 PM > To: dev@ctakes.apache.org > Subject: cTakes Annotation Comparison > > With the recent release of cTakes 3.2.1, we were

RE: cTakes Annotation Comparison

2014-12-18 Thread Chen, Pei
er 18, 2014 1:45 PM To: dev@ctakes.apache.org Subject: cTakes Annotation Comparison With the recent release of cTakes 3.2.1, we were very interested in checking for any differences in annotations between using the AggregatePlaintextUMLSProcessor pipeline and the AggregatePlaintextFastUMLSProc

cTakes Annotation Comparison

2014-12-18 Thread Bruce Tietjen
With the recent release of cTakes 3.2.1, we were very interested in checking for any differences in annotations between using the AggregatePlaintextUMLSProcessor pipeline and the AggregatePlaintextFastUMLSProcessor pipeline within this release of cTakes with its associated set of UMLS resources. W