I use the BsvRegexSectionizer like so:

//This goes in your piper file
add BsvRegexSectionizer SectionsBsv=resources/your_regex_file.bsv

your_regex_file.bsv is a text file where each line is a bar-separated set of:

section_name||(regex statement to identify the section header)

I work in radiology, so a (poorly written!) example of a bsv file (with lots of room for improvement and efficiency!) to find radiology report section headers (Comparison, ExamDate, Findings, Impression) might be:

Comparison||(?:^\s*COMPARISON:|^\s*CORRELATION:)
ExamDate||(?:^\s*DATE OF STUDY:|^\s*DATE:|^\s*Date of procedure:|^\s*EXAM DATE:|^\s*EXAMINATION DATE:)
Findings||(?:^\s*FINDINGS:|^\s*Findings:)
Impression||(?:^\s*IMPRESSION:|^\s*Impression:)

Tom

-----Original Message-----
From: Peter Abramowitsch <pabramowit...@gmail.com>
Sent: Tuesday, March 22, 2022 10:07 PM
To: dev@ctakes.apache.org
Subject: Re: Segment annotation type

Hi Greg

I don't bother about segments but have been pretty successful using this to get a document's sections.

*add org.mitre.medfacts.uima.ZoneAnnotator SectionRegex=org/mitre/medfacts/uima/section_regex.xml*

Have you checked out this annotator? It creates "Heading" types and the config file above is a good place to start from. It has a nice ability to normalize section types so that if note types A and B both have assessment sections that are titled somewhat differently, you can have them both tagged with the same label.

The annotator had some rough edges and unwanted printing in the log, which I've recently modified. I also did some optimization of the code, which was wasting compute cycles by re-initializing itself for every document. I can check it in, but you can get a good flavor of it by trying what's in the codebase already.

Peter

On Tue, Mar 22, 2022 at 6:33 PM Greg Silverman <g...@umn.edu.invalid> wrote:

> How do I modify org.apache.ctakes.typesystem.type.textspan.Segment to
> actually create annotations for document segments/sections?
>
> Also, how do I disable annotations for the SemanticRoleRelation
> annotation type?
>
> Thanks!
>
> Greg--
>
> --
> Greg M. Silverman
> Senior Systems Developer
> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> Department of Surgery
> University of Minnesota
> g...@umn.edu
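
A quick way to try out name||regex lines like the radiology example near the top of this message, before pointing SectionsBsv at the file, is to run them over a sample report outside of cTAKES. The sketch below is only a rough stand-in, not how BsvRegexSectionizer itself is implemented: it uses Python's re module rather than Java's regex engine, it assumes the patterns are applied in multiline mode so that ^ matches at the start of each line, and the function names (load_patterns, find_headers, split_sections) and the sample report text are made up for illustration.

```python
import re

# BSV content copied from the radiology example in this message.
BSV_TEXT = r"""
Comparison||(?:^\s*COMPARISON:|^\s*CORRELATION:)
ExamDate||(?:^\s*DATE OF STUDY:|^\s*DATE:|^\s*Date of procedure:|^\s*EXAM DATE:|^\s*EXAMINATION DATE:)
Findings||(?:^\s*FINDINGS:|^\s*Findings:)
Impression||(?:^\s*IMPRESSION:|^\s*Impression:)
"""

# Invented sample report, used only to exercise the patterns.
SAMPLE_REPORT = """EXAM DATE: 03/22/2022
COMPARISON: None.
FINDINGS: The lungs are clear.
IMPRESSION: No acute abnormality.
"""


def load_patterns(bsv_text):
    """Parse name||regex lines into (name, compiled pattern) pairs."""
    patterns = []
    for line in bsv_text.splitlines():
        line = line.strip()
        if "||" not in line:
            continue  # skip blank or malformed lines
        # Split on the first "||" only, since the regex itself contains "|".
        name, regex = line.split("||", 1)
        # Assumption: multiline mode, so ^ anchors at each line start.
        patterns.append((name.strip(), re.compile(regex, re.MULTILINE)))
    return patterns


def find_headers(text, patterns):
    """Return (start, end, name) for every header match, in document order."""
    hits = []
    for name, pattern in patterns:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), name))
    return sorted(hits)


def split_sections(text, patterns):
    """Pair each detected header with the text running up to the next header."""
    hits = find_headers(text, patterns)
    sections = []
    for i, (_start, end, name) in enumerate(hits):
        next_start = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        sections.append((name, text[end:next_start].strip()))
    return sections


if __name__ == "__main__":
    for name, body in split_sections(SAMPLE_REPORT, load_patterns(BSV_TEXT)):
        print(f"{name}: {body}")
```

Swapping SAMPLE_REPORT for a real (de-identified) report shows which headers a given set of patterns catches and which it misses. Java and Python regex syntax differ in places, so a pattern that passes here still needs checking in an actual cTAKES run.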