This is exactly what I was looking for. I've read your answer a little late 
though, and I have written a Python script which outputs something like this:

Strict Mode (#2)
╭───────────┬───────────┬───────────┬───────────╮
│ iteration │ precision │  recall   │ f1 score  │
├───────────┼───────────┼───────────┼───────────┤
│          0│    0.66667│    0.33333│    0.44444│
│          1│    0.66667│    0.45455│    0.54054│
╰───────────┴───────────┴───────────┴───────────╯

╭────────────────────┬────────────────────┬────────────────────┬────────────────────╮
│      Measure       │     Macro (SD)     │       Micro        │         F1         │
├────────────────────┼────────────────────┼────────────────────┼────────────────────┤
│           Precision│        0.6667 (0.0)│              0.6667│              0.4952│
│              Recall│      0.3939 (0.061)│              0.3846│              0.4878│
╰────────────────────┴────────────────────┴────────────────────┴────────────────────╯

It needs some testing and a clean up. I'll create a git repo once it's done!
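For reference, the strict-mode numbers in the first table come from a computation along these lines (a minimal sketch, not the actual script — the function name and the sample spans are illustrative, and gold/system annotations are assumed to have been reduced to sets of (begin, end) character offsets):

```python
# Minimal sketch of strict-mode precision/recall/F1 for span annotations.
# "Strict" means a system annotation counts as a true positive only if its
# (begin, end) offsets match a gold annotation exactly.

def strict_prf(gold, predicted):
    """gold, predicted: sets of (begin, end) span tuples."""
    tp = len(gold & predicted)  # exact-span matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative spans: 6 gold annotations, 3 system annotations, 2 exact matches.
gold = {(0, 4), (10, 15), (20, 27), (30, 35), (40, 44), (50, 58)}
pred = {(0, 4), (10, 15), (60, 63)}
p, r, f = strict_prf(gold, pred)
print(round(p, 5), round(r, 5), round(f, 5))  # -> 0.66667 0.33333 0.44444
```

For the second table, micro averaging pools the true/false positive counts across all documents before dividing, while macro averaging takes the mean of the per-document scores (with the standard deviation reported alongside).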

Leander



> On 19 Mar 2017, at 16:55, Finan, Sean <sean.fi...@childrens.harvard.edu> 
> wrote:
> 
> Great explanation, 
> Thank you Tim!
> 
> -----Original Message-----
> From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] 
> Sent: Saturday, March 18, 2017 7:18 AM
> To: dev@ctakes.apache.org
> Subject: Re: Evaluate cTAKES performance [SUSPICIOUS]
> 
> To save you a little trouble, in ctakes-temporal we rely a lot on an outside 
> library called ClearTK that has some evaluation APIs built in that work well 
> with UIMA frameworks and typical NLP tasks. We use the following classes:
> http://cleartk.github.io/cleartk/apidocs/2.0.0/org/cleartk/eval/AnnotationStatistics.html
> http://cleartk.github.io/cleartk/apidocs/2.0.0/org/cleartk/eval/Evaluation_ImplBase.html
>  
> 
> The simplest place to start looking in ctakes-temporal is probably the 
> EventAnnotator and its evaluation, since they are simple one word spans. Then 
> the TimeAnnotator is slightly more complicated with multi-word spans. Then if 
> you are interested in evaluating relations I would suggest switching over to 
> ctakes-relation-extractor which is more stable than the ctakes-temporal 
> relation code, which is an area of highly active (i.e., funded) research and 
> so the code has not been cleaned up as much.
> Tim
> 
> ________________________________________
> From: Leander Melms <me...@students.uni-marburg.de>
> Sent: Friday, March 17, 2017 3:05 PM
> To: dev@ctakes.apache.org
> Subject: Re: Evaluate cTAKES performance
> 
> Thanks! I'll have a look at it and will try to give something back to the 
> community!
> 
> Leander
> 
> 
>> On 17 Mar 2017, at 19:42, Finan, Sean <sean.fi...@childrens.harvard.edu> 
>> wrote:
>> 
>> Ah - you meant the best way to test.  Sorry, I misread your inquiry as 
>> asking for the best way to write output.
>> 
>> Yes, that is a great introduction document for ctakes and early tests.  
>> There are a few small test classes in ctakes that read anafora files, run 
>> ctakes and run agreement numbers.  You can find some in the ctakes-temporal 
>> module.  I didn't write them, and I think that they are built-to-fit 
>> purpose-driven classes, but you could try to adapt them to a general purpose 
>> case.  That would be a great thing to have in ctakes!
>> 
>> Sean
>> 
>> -----Original Message-----
>> From: Leander Melms [mailto:me...@students.uni-marburg.de]
>> Sent: Friday, March 17, 2017 1:46 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: Evaluate cTAKES performance
>> 
>> Hi Sean,
>> 
>> thank you (again) for your help and feedback! I'll give it a try! Seems like 
>> the authors of the publication "Mayo clinical Text analysis and Knowledge 
>> Extraction System" 
>> (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995668/) did this as well.
>> 
>> Thank you
>> Leander
>> 
>> 
>> 
>>> On 17 Mar 2017, at 18:33, Finan, Sean <sean.fi...@childrens.harvard.edu> 
>>> wrote:
>>> 
>>> Hi Leander,
>>> 
>>> There is no single correct way to do this, but a couple of similar 
>>> classes exist.  Well, one sat in my sandbox for two years until about 5 
>>> seconds ago, as I only just checked it in.  Anyway, take a look at two 
>>> classes in ctakes-core, in the package org.apache.ctakes.core: TextSpanWriter 
>>> and CuiCountFileWriter.
>>> 
>>> TextSpanWriter writes annotation name | span | covered text in a file, one 
>>> per document.
>>> 
>>> CuiCountFileWriter writes a list of discovered cuis and their counts.
>>> 
>>> It sounds like you are interested in a combination of both - basically 
>>> TextSpanWriter with the added output of CUIs.
>>> 
>>> You can also have a look at EntityCollector of 
>>> org.apache.ctakes.core.pipeline.  It has an annotation engine that keeps a 
>>> running list of "entities" for the whole run, doc ids, spans, text and cuis.
>>> 
>>> Sean
>>> 
>>> 
>>> -----Original Message-----
>>> From: Leander Melms [mailto:me...@students.uni-marburg.de]
>>> Sent: Friday, March 17, 2017 1:09 PM
>>> To: dev@ctakes.apache.org
>>> Subject: Re: Evaluate cTAKES performance
>>> 
>>> Sorry for writing again. I just have a quick question: my idea is to parse 
>>> the cTAKES output into a text file with a structure like 
>>> DocName|Spans|CUI|CoveredText|ConceptType and do the same with the gold 
>>> standard (from Anafora).
>>> 
>>> Is this a correct way to do this?
>>> 
>>> I'm new to the subject and happy about the tiniest information on the topic.
>>> 
>>> Thanks
>>> Leander
>>> 
>>>> On 17 Mar 2017, at 12:05, Leander Melms <me...@students.uni-marburg.de> 
>>>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I've integrated a custom dictionary, retrained some of the OpenNLP models 
>>>> and would like to evaluate the changes on a gold standard. I'd like to 
>>>> calculate the precision, the recall and the f1-score to compare the 
>>>> results.
>>>> 
>>>> My question is: Does cTAKES ship with some evaluation / test scripts? What 
>>>> is the best strategy to do this? Has anyone dealt with this topic before?
>>>> 
>>>> I'm happy to share the results afterwards if there is interest in them.
>>>> 
>>>> Thanks
>>>> Leander
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
