Thanks for the prompt reply.  Since I'm using Java and my database is MySQL,
do you know what I should use to output this character (Unicode: 0x1e)
correctly within the CDATA section?

If there's no easy solution, does that mean I have to filter out these funky
characters before outputting them in the CDATA section?
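
If filtering does turn out to be the only option, here is a minimal sketch of
what such a filter might look like. It assumes the document is XML 1.0, whose
Char production allows only #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and
#x10000-#x10FFFF, so 0x1e would be dropped. The class and method names
(XmlCharFilter, isValidXml10, stripInvalidXmlChars) are made up for
illustration and are not part of Xerces or the JDK.

```java
// Hypothetical helper, not part of Xerces or the JDK.
final class XmlCharFilter {
    private XmlCharFilter() {}

    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every disallowed code point removed.
    // Unpaired surrogates fall in 0xD800-0xDFFF and are removed as well.
    static String stripInvalidXmlChars(String in) {
        if (in == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            if (isValidXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

The idea would be to pass each value read from the database through
stripInvalidXmlChars before writing it into the CDATA section. This silently
discards data, so it only makes sense if the control characters carry no
meaning for the application.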



Max O Bowsher wrote:
> 
> pmkwan wrote:
>> Can someone please explain why the parser is throwing this error:
>> 
>> xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1e) was found in the CDATA section.
>>      at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
>>      at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
>>      at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
>>      at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
>>      at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
>>      at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
>> 
>> 
>> I am using <?xml version="1.0" encoding="UTF-8"?> in my XML file and I
>> set my outputStreamWriter to use UTF-8 as well.  The data I captured was
>> from our database and the character set is probably not UTF-8.  Does that
>> matter?
> 
> Yes, it does matter.
> 
>> I thought the parser was not supposed to parse anything within the CDATA
>> section in the XML file.  So why would this exception even happen?
> 
> Bytes are parsed into characters. Characters are then parsed for XML
> markup. CDATA only inhibits the second of those two processes.
> 
> i.e., CDATA sections still must contain valid data according to the
> character set of the document, and furthermore, the characters must fall
> within the subset of characters permitted in XML.
> 
> There is no syntax that allows you to embed raw bytes within an XML
> document.
> 
> Max.
