I’ve encountered that error when the input text file contains control characters, for example ^M.
The fix I used was to remove all control characters from the input text files ahead of time via Python.

Best,
John Caskey
UW-Madison
jrcas...@wisc.edu

________________________________
From: Greg Silverman <g...@umn.edu.INVALID>
Sent: Sunday, March 6, 2022 12:40:00 PM
To: dev@ctakes.apache.org <dev@ctakes.apache.org>
Subject: Issue with serializable XML

Got the error during processing of a large set of documents about mid-way through:

org.xml.sax.SAXParseException: Trying to serialize non-XML 1.0 character: , 0x1c

I encountered this once before, but I don't remember what the fix was. Running apache-ctakes-4.0.1-SNAPSHOT.

Thanks!

Greg

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu
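
For reference, the cleanup described in the reply at the top of this thread (stripping control characters from the input before running cTAKES) could look roughly like the Python sketch below. This is not the script actually used there. It assumes UTF-8 input files, and the names `strip_invalid_xml_chars` and `clean_file` are hypothetical. Rather than removing every control character, it removes only the characters that XML 1.0 does not allow (which includes 0x1c from the error message), so tab, line feed, and carriage return are left alone.

```python
import re
from pathlib import Path

# Matches anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that XML 1.0 cannot serialize removed."""
    return _INVALID_XML_CHARS.sub("", text)


def clean_file(src: Path, dst: Path, encoding: str = "utf-8") -> None:
    """Write a cleaned copy of src to dst (hypothetical helper).

    Undecodable bytes become U+FFFD, which XML 1.0 permits. Note that
    read_text() applies Python's usual newline translation on read.
    """
    text = src.read_text(encoding=encoding, errors="replace")
    dst.write_text(strip_invalid_xml_chars(text), encoding=encoding)
```

Running `clean_file` over each note into a separate output directory, and pointing cTAKES at that directory, keeps the original files untouched in case the removed characters matter later.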