Hi Sean,

thank you (again) for your help and feedback! I'll give it a try! It seems that the authors of the publication "Mayo clinical Text analysis and Knowledge Extraction System" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995668/) did this as well.
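A minimal sketch of the comparison step, assuming both the cTAKES output and the gold standard have been written as DocName|Spans|CUI|CoveredText|ConceptType lines, and that only an exact match on document, span and CUI counts as a true positive. The function names, the "begin,end" span notation and the placeholder CUIs are made up for illustration; none of this is part of cTAKES.

```python
from typing import Iterable, Set, Tuple

# One annotation is identified by (document name, span, CUI).
Key = Tuple[str, str, str]


def parse_lines(lines: Iterable[str]) -> Set[Key]:
    """Turn 'DocName|Spans|CUI|CoveredText|ConceptType' lines into a set of keys."""
    keys: Set[Key] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split("|")
        if len(fields) < 3:
            continue  # ignore lines without doc, span and CUI
        doc, span, cui = (field.strip() for field in fields[:3])
        keys.add((doc, span, cui))
    return keys


def precision_recall_f1(gold: Set[Key], predicted: Set[Key]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1; 0.0 where a value is undefined."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder data: the CUIs and spans below are not real annotations.
    gold = parse_lines([
        "doc1|0,5|C0000001|fever|SignSymptom",
        "doc1|10,15|C0000002|cough|SignSymptom",
    ])
    system = parse_lines([
        "doc1|0,5|C0000001|fever|SignSymptom",
        "doc1|20,24|C0000003|pain|SignSymptom",
    ])
    p, r, f1 = precision_recall_f1(gold, system)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; whether overlapping spans or span-only matches should also count is a separate decision that changes the numbers.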
Thank you
Leander

> On 17 Mar 2017, at 18:33, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
>
> Hi Leander,
>
> There is no single correct way to do this, but a couple of similar classes
> exist. Well, one sat in my sandbox for two years until about 5 seconds ago,
> as I only just checked it in. Anyway, take a look at two classes in
> ctakes-core org.apache.ctakes.core
> They are TextSpanWriter and CuiCountFileWriter.
>
> TextSpanWriter writes annotation name | span | covered text in a file, one
> per document.
>
> CuiCountFileWriter writes a list of discovered CUIs and their counts.
>
> It sounds like you are interested in a combination of both: basically
> TextSpanWriter with the added output of CUIs.
>
> You can also have a look at EntityCollector in
> org.apache.ctakes.core.pipeline. It has an annotation engine that keeps a
> running list of "entities" for the whole run: doc ids, spans, text and CUIs.
>
> Sean
>
>
> -----Original Message-----
> From: Leander Melms [mailto:me...@students.uni-marburg.de]
> Sent: Friday, March 17, 2017 1:09 PM
> To: dev@ctakes.apache.org
> Subject: Re: Evaluate cTAKES perfomance
>
> Sorry for writing again. I just have a quick question: My idea is to parse
> the cTAKES output to a text file with a structure like this
> DocName|Spans|CUI|CoveredText|ConceptType and do the same with the gold
> standard (from Anafora).
>
> Is this a correct way to do this?
>
> I'm new to the subject and happy about the tiniest information on the topic.
>
> Thanks
> Leander
>
>> On 17 Mar 2017, at 12:05, Leander Melms <me...@students.uni-marburg.de> wrote:
>>
>> Hi,
>>
>> I've integrated a custom dictionary, retrained some of the OpenNLP models
>> and would like to evaluate the changes on a gold standard. I'd like to
>> calculate the precision, the recall and the F1 score to compare the results.
>>
>> My question is: Does cTAKES ship with some evaluation / test scripts? What
>> is the best strategy to do this? Has anyone dealt with this topic before?
>>
>> I'm happy to share the results afterwards if there is interest in it.
>>
>> Thanks
>> Leander