Hi Taki,

It's a long-standing bug/limitation. Xerces uses String.length() (which
returns the length of the string in chars, i.e. UTF-16 code units, rather
than Unicode code points) for checking the length facet, so a character
outside the BMP is counted as two.
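
A minimal sketch of the difference, in case it helps. This is not the
Xerces code; the class and method names are made up for illustration,
and only the java.lang.String and java.lang.Character calls are real:

```java
public class LengthFacetDemo {

    // Length in UTF-16 code units, which is what String.length() reports.
    static int charLength(String value) {
        return value.length();
    }

    // Length in Unicode code points; a surrogate pair counts once.
    static int codePointLength(String value) {
        return value.codePointCount(0, value.length());
    }

    public static void main(String[] args) {
        // U+2000B, the non-BMP character from the schema's default value.
        String value = new String(Character.toChars(0x2000B));

        System.out.println("chars:       " + charLength(value));      // 2
        System.out.println("code points: " + codePointLength(value)); // 1
    }
}
```

Counting code points gives 1 for this value, which is what the length
facet in the schema expects.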

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

"Taki Kamiya" <[EMAIL PROTECTED]> wrote on 08/05/2008 08:20:38 PM:

> Hi,
>
> The following schema, which is supposedly valid, results in this error:
>
>   cvc-length-valid: Value '𠀋' with length = '2' is not facet-valid with respect to length '1' for type '#AnonType_act'.
>
> The default value "&#x2000B;" for attribute "a" is a single non-BMP character.
> It is as though a surrogate pair is counted as two characters.
>
> Regards,
>
> -taki
>
>
>
> <xsd:schema targetNamespace="urn:foo"
>            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>            xmlns:foo="urn:foo">
>
> <xsd:complexType name="ct">
>   <xsd:attribute name="a" default="&#x2000B;"><!-- single character in SIP (U+2000B) -->
>     <xsd:simpleType>
>       <xsd:restriction base="xsd:string">
>         <xsd:length value="1"/>
>       </xsd:restriction>
>     </xsd:simpleType>
>   </xsd:attribute>
> </xsd:complexType>
>
> </xsd:schema>
>
>
