Thanks for the prompt reply. Since I'm using Java and my database is MySQL, do you know what I should use to output this character (Unicode: 0x1e) correctly within the CDATA section?
If there's no easy solution, does that mean I have to filter out these funky characters before outputting them in the CDATA section?

Max O Bowsher wrote:
>
> pmkwan wrote:
>> Can someone please explain why the parser is throwing this error:
>>
>> xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1e) was
>> found in the CDATA section.
>>     at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
>>     at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
>>     at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
>>     at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
>>
>>
>> I am using <?xml version="1.0" encoding="UTF-8"?> in my xml file and I
>> set my outputStreamWriter to use UTF-8 as well. The data I captured was
>> from our database and the character set is probably not UTF-8. Does that
>> matter?
>
> Yes, it does matter.
>
>> I thought the parser is not supposed to parse anything within the CDATA
>> section in the xml file. So why would this exception even happen?
>
> Bytes are parsed into characters. Characters are then parsed for XML
> markup. CDATA only inhibits the second of those two processes.
>
> i.e., CDATA sections still must contain valid data according to the
> character set of the document, and furthermore, the characters must fall
> within the subset of characters permitted in XML.
>
> There is no syntax that allows you to embed raw bytes within an XML
> document.
>
> Max.
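
In case it helps frame the question, here is a rough sketch of the filtering I have in mind. It assumes the document stays XML 1.0, where the permitted characters are tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD and U+10000 to U+10FFFF, so 0x1e is simply dropped. The class and method names (XmlCharFilter, isValidXml10, stripInvalid) are made up for illustration and are not part of Xerces or the JDK.

```java
// Hypothetical helper, not a Xerces or JDK API: removes code points that
// XML 1.0 does not permit, before the text is written into a CDATA section.
final class XmlCharFilter {
    private XmlCharFilter() {}

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every disallowed code point removed.
    // Walks by code point so that valid surrogate pairs are kept intact,
    // while unpaired surrogates are dropped.
    static String stripInvalid(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            if (isValidXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With this, a string holding "a", then the 0x1e character, then "b" would come back as "ab". Whether silently dropping the character is acceptable, or whether it should be replaced with something visible instead, depends on what the data means, so treat this as one option rather than the answer.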
-- 
View this message in context: http://www.nabble.com/An-invalid-XML-character-%28Unicode%3A-0x1e%29-was-found-in-the-CDATA-section-tf4233631.html#a12045428
Sent from the Xerces - J - Users mailing list archive at Nabble.com.