Hi,

Fuzzo <[EMAIL PROTECTED]> wrote on 10/22/2008 03:54:18 AM:

> Hi all!
>
> Let me explain the problem with an example.
> I have to parse an XML in this form:
>
> <anomaly id="0012" severity="4">some_text_with_%A7_symbol</anomaly>
>
> With Xerces1 SAX parser, the element text (some_text_with_%A7_symbol) is
> parsed in one solution with full length invoking the characters(char[] ch,
> int start, int length) method.
>
> With Xerces2, the element text is parsed in 30 bytes slot and the method is
> invoked some times until the text element is fully parsed.
>
> Now, in my application the text element is sometimes encoded with
> java.net.URLEncoder class and then decoded with java.net.URLDecoder.
>
> With Xerces2, happens that the element substring can be in form of
> first_part_of_text_% and URLDecoder can't handle correctly the final % char,
> giving me a URLDecoder: Incomplete trailing escape (%) pattern because it
> does not find the 2 following chars (ex.: %A7 means the § symbol in Cp1252
> encoding).
>
> There is a way to configure Xerces2 to parse text elements in only one
> solution?

No. characters() may be called multiple times [1][2] for contiguous text.
You cannot assume it will only be called once. Your ContentHandler needs to
accumulate the text returned in each call of characters() until you receive
a callback that isn't characters.

> Many thanks!

Thanks.

[1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)
[2] http://xerces.apache.org/xerces2-j/faq-sax.html#faq-2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
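
For illustration, the accumulation approach described in the reply could look roughly like the sketch below. It is not taken from the original application: the class name AnomalyTextHandler, the parse() helper, and the choice to decode in endElement() are assumptions, as is the use of the JDK's built-in JAXP parser rather than a specific Xerces release. The only point it tries to show is that text is appended in every characters() call and URL-decoded once, after the element has ended.

```java
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers element text across characters() calls.
class AnomalyTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> anomalies = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start a fresh buffer for each element. This is enough for leaf
        // elements such as <anomaly>; mixed content would need a stack.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text: only append here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("anomaly".equals(qName)) {
            try {
                // The text is complete now, so an escape like %A7 cannot be cut in half.
                // "windows-1252" is the canonical charset name for Cp1252.
                anomalies.add(URLDecoder.decode(text.toString(), "windows-1252"));
            } catch (UnsupportedEncodingException e) {
                throw new SAXException(e);
            }
        }
        text.setLength(0);
    }

    List<String> getAnomalies() {
        return anomalies;
    }

    // Convenience helper (an assumption for this sketch): parse a string and
    // return the decoded text of every <anomaly> element.
    static List<String> parse(String xml) throws Exception {
        AnomalyTextHandler handler = new AnomalyTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getAnomalies();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<anomaly id=\"0012\" severity=\"4\">some_text_with_%A7_symbol</anomaly>";
        System.out.println(parse(xml));
    }
}
```

Because decoding happens in endElement(), the result does not depend on how the parser splits the text, so the same handler should behave identically whether characters() is invoked once or many times.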