Hi Andrey,

The CDA sectionizer is a rule-based (regex) method for section header matching. It follows the consolidated CDA/HL7 standard for defining a section header template. The template format is: HL7 template ID, LOINC section code, and a list of n header names (case-insensitive; n can be as large as you need).
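To make the format concrete, here is a minimal Python sketch of strict, case-insensitive header matching against a comma-separated template line. It is only an illustration of the idea, not the cTAKES implementation: the function names, the sample "allergies" template, and the stripping of a trailing colon are all assumptions made for the example.

```python
def parse_template(line):
    """Split 'id,code,header1,header2,...' into (id, code, set of headers)."""
    fields = [f.strip() for f in line.split(",")]
    entry_id, section_code, headers = fields[0], fields[1], fields[2:]
    # Header names are compared case-insensitively, so store them lowercased.
    return entry_id, section_code, {h.lower() for h in headers}


def match_header(candidate, templates):
    """Return (entry_id, section_code) on an exact header match, else None."""
    # Assumption: a trailing colon after the header text is ignored.
    key = candidate.strip().rstrip(":").strip().lower()
    for entry_id, section_code, headers in templates:
        if key in headers:
            return entry_id, section_code
    return None


# Hypothetical template line, for illustration only.
templates = [parse_template("allergies,2,allergies,known allergies")]

print(match_header("KNOWN ALLERGIES:", templates))  # ('allergies', '2')
print(match_header("Drug allergies", templates))    # None
```

The second lookup fails because "drug allergies" is not listed verbatim in the template; with strict matching, every header variant has to be enumerated.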
For example, a history-related section header template can be defined as:

history,1,brief history of physical illness,history of present illness,history of the present illness

"history" is the entry ID (named by yourself), "1" is the section code (named by yourself), and the rest are the variants of history section headers that appear in a dataset. Note that the matching is very specific: if you only list "history of present illness", it will not find "history of the present illness" unless you list both.

As you can see, it's a strict template-matching algorithm, so if you know your data, especially all the section headers, it can do the job. I have used the CDA sectionizer for two projects. The notes I processed had a standard section header format, so the performance was acceptable.

Hope this is helpful.

Best,
Chen

On 2/22/17, 3:23 AM, "Andrey Kurdumov" <kant2...@googlemail.com> wrote:

>Does anybody know what expected performance of the current CDA section
>finder in cTakes?
>
>How it was created, since I don't see any test cases for it? Does
>it was created on public or private dataset?
>
>Best regards,
>Andrey Kurdyumov