Hi,
 
After thinking about the choices you presented, I tested encoding the character
as , and that works.  Is this the correct thing to do?
 
Ed

________________________________

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Friday, January 25, 2008 2:53 PM
To: j-users@xerces.apache.org
Subject: Re: Problem parsing attribute with character 7F



7F is a legal XML 1.0 character and XML 1.0 should accept it. And, yes,
I believe that in UTF-8 (are you SURE you're reading the file as UTF-8
rather than some other encoding?) it should be a legitimate single byte.


However, the XML 1.0 spec's section 2.2 says "Document authors are
encouraged to avoid 'compatibility characters', as defined in section
6.8 of [Unicode] <http://www.w3.org/TR/REC-xml/#Unicode> (see also D21
in section 3.6 of [Unicode3] <http://www.w3.org/TR/REC-xml/#Unicode3>)."
So using this character is ill-advised, even though it is legal.

XML 1.1 does generally accept more characters than XML 1.0 does -- but
note that 7F is one of the RestrictedChars, and that XML 1.1 explicitly
says these may not appear within documents or external parsed entities.
(Alas, the Recommendation does not explain why these are restricted.)
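
To make the difference between the two versions concrete, here is a small sketch of the Char productions from XML 1.0 and XML 1.1 and the RestrictedChar production from XML 1.1. The function names are my own and purely illustrative; only the code point ranges come from the Recommendations.

```python
# Code point ranges transcribed from the XML 1.0 and XML 1.1 grammars.
# The function names are hypothetical helpers, not part of any parser API.

def is_xml10_char(cp: int) -> bool:
    """Char production of XML 1.0."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_xml11_char(cp: int) -> bool:
    """Char production of XML 1.1 (wider than 1.0 at the low end)."""
    return (0x1 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_xml11_restricted(cp: int) -> bool:
    """RestrictedChar production of XML 1.1."""
    return (0x1 <= cp <= 0x8
            or 0xB <= cp <= 0xC
            or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84
            or 0x86 <= cp <= 0x9F)

# 0x7F is a Char in both versions, but restricted in XML 1.1.
print(is_xml10_char(0x7F), is_xml11_char(0x7F), is_xml11_restricted(0x7F))
```

Note that 0x7F still matches Char in XML 1.1; the restriction concerns where the literal character may appear, not whether the code point exists in the grammar.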

Looks to me like you have several choices: Eliminate that character,
encode it somehow (note that a numeric character reference probably
wouldn't solve this problem), or go back to XML 1.0.
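
For the "encode it somehow" option, here is a minimal sketch that rewrites XML 1.1 RestrictedChar code points as hexadecimal character references before the text is written into a document. This rests on my reading of the XML 1.1 grammar: the Char production still covers these code points, so a reference such as &#x7F; should be well-formed even where the literal character is not, which would make the hedge above about numeric references too pessimistic. The follow-up at the top of this thread reports that encoding the character did work. The helper name is hypothetical, and the sketch deliberately does not escape &, < or quotes.

```python
import re

# Character class covering the XML 1.1 RestrictedChar ranges:
# #x1-#x8, #xB-#xC, #xE-#x1F, #x7F-#x84, #x86-#x9F
_RESTRICTED = re.compile(r"[\x01-\x08\x0B\x0C\x0E-\x1F\x7F-\x84\x86-\x9F]")

def escape_restricted(text: str) -> str:
    """Replace each restricted character with a hex character reference.

    Hypothetical helper for illustration; markup characters such as
    '&' and '<' are left untouched and must be escaped separately.
    """
    return _RESTRICTED.sub(lambda m: "&#x%X;" % ord(m.group()), text)

print(escape_restricted("before\x7fafter"))
```

Tab, line feed, carriage return and NEL (0x85) are not in the restricted ranges, so they pass through unchanged.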

______________________________________
"... Three things see no end: A loop with exit code done wrong,
A semaphore untested, And the change that comes along. ..."
-- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
(http://www.ovff.org/pegasus/songs/threes-rev-11.html)
