Xerces2 vs Xerces1 Element Text Parsing Implementation

Fuzzo Wed, 22 Oct 2008 00:54:52 -0700

Hi all!

Let me explain the problem with an example.
I have to parse an XML document of this form:


<anomaly id="0012" severity="4">some_text_with_%A7_symbol</anomaly>

With the Xerces1 SAX parser, the element text (some_text_with_%A7_symbol) is
delivered in full by a single call to the characters(char[] ch, int start,
int length) method.

With Xerces2, the element text is delivered in 30-byte chunks, and the method
is invoked several times until the element text has been fully parsed.

Now, in my application the element text is sometimes encoded with the
java.net.URLEncoder class and later decoded with java.net.URLDecoder.

With Xerces2, it can happen that a chunk has the form first_part_of_text_%,
and URLDecoder can't handle the trailing % char correctly. It fails with
"URLDecoder: Incomplete trailing escape (%) pattern" because it does not find
the 2 chars that should follow (e.g. %A7 stands for the § symbol in the Cp1252
encoding).
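
To make the failure concrete, here is a minimal sketch that reproduces it
outside the parser. The split point between the two chunks is an assumption
chosen for illustration, and the class and method names are hypothetical;
only java.net.URLDecoder comes from the description above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class ChunkDecodeDemo {

    // Decodes a URL-encoded string using the Cp1252 charset mentioned above.
    static String decode(String s) throws UnsupportedEncodingException {
        return URLDecoder.decode(s, "Cp1252");
    }

    // Returns true if URLDecoder rejects the input with IllegalArgumentException.
    static boolean failsToDecode(String s) throws UnsupportedEncodingException {
        try {
            decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed chunk boundary: the first chunk ends right after the '%'.
        String firstChunk = "some_text_with_%";
        String secondChunk = "A7_symbol";

        // Decoding the first chunk alone fails: the escape is incomplete.
        System.out.println("first chunk alone fails: " + failsToDecode(firstChunk));

        // Decoding the rejoined text succeeds.
        System.out.println("joined text decodes to: " + decode(firstChunk + secondChunk));
    }
}
```

The point of the sketch is that the input is only valid for URLDecoder once
the chunks have been put back together.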

Is there a way to configure Xerces2 to deliver the element text in a single
call?
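
For reference, the SAX ContentHandler contract allows a parser to split
contiguous character data across several characters() calls, so a handler
that does not depend on the chunk size may be the safer route. The following
is only a sketch of that idea, not a Xerces setting: the class name
BufferingHandler and the helper parse() are hypothetical, and it assumes the
<anomaly> element contains text only (no child elements).

```java
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class BufferingHandler extends DefaultHandler {

    // Collects every chunk passed to characters() for the current element.
    private final StringBuilder text = new StringBuilder();
    private String decoded;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("anomaly".equals(qName)) {
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called once or many times; just append each chunk.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("anomaly".equals(qName)) {
            try {
                // Decode only once the whole element text is available.
                decoded = URLDecoder.decode(text.toString(), "Cp1252");
            } catch (UnsupportedEncodingException e) {
                throw new SAXException(e);
            }
        }
    }

    public String getDecoded() {
        return decoded;
    }

    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static String parse(String xml) throws Exception {
        BufferingHandler handler = new BufferingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getDecoded();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<anomaly id=\"0012\" severity=\"4\">"
                + "some_text_with_%A7_symbol</anomaly>";
        System.out.println(parse(xml));
    }
}
```

Because the decoding happens in endElement(), the result is the same whether
the parser delivers the text in one call or in many.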

Many thanks!



