Kean Kaufmann created CTAKES-450:
------------------------------------

             Summary: CDASegmentAnnotator misses all headings after empty segment
                 Key: CTAKES-450
                 URL: https://issues.apache.org/jira/browse/CTAKES-450
             Project: cTAKES
          Issue Type: Bug
          Components: ctakes-core
            Reporter: Kean Kaufmann
         Attachments: CDASegmentAnnotator.diff

If the CDASegmentAnnotator encounters an empty segment, it discards every
segment after that point in the document. You can see this in the test
document provided for TestCDASegmentAnnotator: the heading "CURRENT HEALTH
STATUS" is followed immediately by the heading "Medications", and the test
case misses both the "Medications" heading and the "FAMILY HISTORY" heading
after it. The cause is that the loop over sorted_segments increments its
index variable only for non-empty segments.
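
To illustrate the failure mode, here is a minimal, self-contained sketch. It
is not the actual CDASegmentAnnotator code: the Heading class, the collect()
method, and the offsets are hypothetical and only reproduce the pattern
described above, where an index used to look up the next heading advances
only when the current segment is non-empty. Once an empty segment is seen,
the index lags behind, every later lookup resolves to the wrong heading, and
all later segments are treated as empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmentLoopSketch {

    /** Hypothetical heading: offset of the heading text and of the body after it. */
    static final class Heading {
        final String name;
        final int start;      // offset where the heading text begins
        final int bodyStart;  // offset just past the heading text

        Heading(String name, int start, int bodyStart) {
            this.name = name;
            this.start = start;
            this.bodyStart = bodyStart;
        }
    }

    /**
     * Returns the names of headings whose segment is non-empty.
     * With fixed == false the index advances only for non-empty segments
     * (the bug pattern); with fixed == true it advances on every iteration.
     */
    static List<String> collect(List<Heading> sorted, int docLength, boolean fixed) {
        List<String> found = new ArrayList<>();
        int index = 0;
        for (Heading h : sorted) {
            // A segment ends where the next heading starts, or at end of document.
            int end = (index + 1 < sorted.size())
                    ? sorted.get(index + 1).start
                    : docLength;
            boolean nonEmpty = end > h.bodyStart;
            if (nonEmpty) {
                found.add(h.name);
            }
            if (nonEmpty || fixed) {
                index++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Heading> sorted = Arrays.asList(
                new Heading("A", 0, 2),
                new Heading("B", 10, 12),  // empty: the next heading starts at 12
                new Heading("C", 12, 14),
                new Heading("D", 20, 22));
        System.out.println(collect(sorted, 30, false)); // prints [A]
        System.out.println(collect(sorted, 30, true));  // prints [A, C, D]
    }
}
```

In both variants the empty heading "B" is omitted; the difference is that the
buggy variant also loses "C" and "D", which mirrors the before and after
output shown in this report.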

Patch attached.

TestCDASegmentAnnotator output before fix (with getPreferredText()):

Segment:2.16.840.1.113883.10.20.22.1.1 Begin:92 End:159: Header
Segment:1.3.6.1.4.1.19376.1.5.3.1.1.13.2.1 Begin:176 End:1612: CHIEF COMPLAINT
Segment:2.16.840.1.113883.10.20.22.2.20 Begin:1634 End:1696: HISTORY OF PAST ILLNESS
Segment:2.16.840.1.113883.10.20.22.2.2.1 Begin:1711 End:2271: History of immunizations

After fix:

Segment:2.16.840.1.113883.10.20.22.1.1 Begin:92 End:159: Header
Segment:1.3.6.1.4.1.19376.1.5.3.1.1.13.2.1 Begin:176 End:1612: CHIEF COMPLAINT
Segment:2.16.840.1.113883.10.20.22.2.20 Begin:1634 End:1696: HISTORY OF PAST ILLNESS
Segment:2.16.840.1.113883.10.20.22.2.2.1 Begin:1711 End:2271: History of immunizations
Segment:2.16.840.1.113883.10.20.22.2.1.1 Begin:2307 End:3506: HISTORY OF MEDICATION USE
Segment:2.16.840.1.113883.10.20.22.2.15 Begin:3522 End:5608: Family History
