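Hi Taki,

It's a long-standing bug/limitation. Xerces uses String.length(), which returns the length of the string in chars (UTF-16 code units) rather than Unicode code points, for checking the length facet. A character outside the BMP such as U+2000B is stored as a surrogate pair, so it is counted as 2.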
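To illustrate the difference, here is a minimal sketch. It is not Xerces code; the class and method names are made up for this example, and it only assumes that the length facet for xsd:string is meant to count characters (code points) rather than UTF-16 code units.

```java
public class LengthFacetDemo {
    // Length as String.length() reports it: number of UTF-16 code units (chars).
    static int utf16Length(String value) {
        return value.length();
    }

    // Length in Unicode code points; a surrogate pair counts as one.
    static int codePointLength(String value) {
        return value.codePointCount(0, value.length());
    }

    public static void main(String[] args) {
        // U+2000B, the non-BMP character used as the default value in the schema.
        String value = new String(Character.toChars(0x2000B));
        System.out.println(utf16Length(value));     // prints 2
        System.out.println(codePointLength(value)); // prints 1
    }
}
```

With a check based on codePointCount() the default value would satisfy length = 1, whereas the check based on length() reports 2, which matches the error message.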
Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

"Taki Kamiya" <[EMAIL PROTECTED]> wrote on 08/05/2008 08:20:38 PM:

> Hi,
>
> The following schema, which is supposedly valid, results in this error:
>
> cvc-length-valid: Value '𠀋' with length = '2' is not facet-valid
> with respect to length '1' for type '#AnonType_act'.
>
> The default value "𠀋" for attribute "a" is a single non-BMP character.
> It is as though a surrogate pair is counted as two characters.
>
> Regards,
>
> -taki
>
> <xsd:schema targetNamespace="urn:foo"
>     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>     xmlns:foo="urn:foo">
>
>   <xsd:complexType name="ct">
>     <xsd:attribute name="a" default="𠀋"><!-- single character in SIP (U+2000B) -->
>       <xsd:simpleType>
>         <xsd:restriction base="xsd:string">
>           <xsd:length value="1"/>
>         </xsd:restriction>
>       </xsd:simpleType>
>     </xsd:attribute>
>   </xsd:complexType>
>
> </xsd:schema>