Hi Nathan,

Is the implementation of that method any better than iterating over the string and counting the code points? I think the last time I noticed this bug in the code I resisted fixing it because of the negative performance impact on the majority of input, which only contains characters in the BMP.
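To make the comparison concrete, here is a rough sketch (not Xerces code) of the three ways of measuring length under discussion. The class name CodePointLength and the helper codePointLength are hypothetical names made up for illustration. The helper starts from String.length() and subtracts one for each well-formed surrogate pair, so BMP-only input costs one pass and two cheap range checks per char. I have not measured it against String.codePointCount, so treat the performance question as still open.

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class CodePointLength {

    // Counts Unicode code points by starting from the char count and
    // subtracting one for every well-formed surrogate pair. Unpaired
    // surrogates count as one each, as String.codePointCount does.
    static int codePointLength(String s) {
        final int len = s.length();
        int count = len;
        for (int i = 0; i < len - 1; i++) {
            if (Character.isHighSurrogate(s.charAt(i))
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                count--; // two chars, one code point
                i++;     // skip the low surrogate
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // U+2000B, the character from Taki's schema, as a surrogate pair.
        String value = "\uD840\uDC0B";
        System.out.println(value.length());                          // chars
        System.out.println(value.codePointCount(0, value.length())); // Java 5 API
        System.out.println(codePointLength(value));                  // manual count
    }
}
```

For the default value in the reported schema, length() gives 2 while the other two give 1, which is the mismatch behind the cvc-length-valid error.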
Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 08/05/2008 09:34:35 PM:

> This might be an additional impetus to move the code base for future
> development to Java 5 libraries, so things like
> String.codePointCount can be used.
>
> -Nathan
>
> On Tue, Aug 5, 2008 at 7:59 PM, Michael Glavassevich <[EMAIL PROTECTED]> wrote:
> Hi Taki,
>
> It's a long standing bug/limitation. Xerces uses String.length()
> (which returns the length of the string in chars rather than Unicode
> code points) for checking the length facet.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: [EMAIL PROTECTED]
> E-mail: [EMAIL PROTECTED]
>
> "Taki Kamiya" <[EMAIL PROTECTED]> wrote on 08/05/2008 08:20:38 PM:
>
> > Hi,
> >
> > The following schema, which is supposedly valid, results in this error:
> >
> > cvc-length-valid: Value '𠀋' with length = '2' is not facet-valid
> > with respect to length '1' for type '#AnonType_act'.
> >
> > The default value "𠀋" for attribute "a" is a single non-BMP character.
> > It is as though a surrogate pair is counted as two characters.
> >
> > Regards,
> >
> > -taki
> >
> > <xsd:schema targetNamespace="urn:foo"
> >             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> >             xmlns:foo="urn:foo">
> >
> >   <xsd:complexType name="ct">
> >     <xsd:attribute name="a" default="𠀋"><!-- single character in SIP (U+2000B) -->
> >       <xsd:simpleType>
> >         <xsd:restriction base="xsd:string">
> >           <xsd:length value="1"/>
> >         </xsd:restriction>
> >       </xsd:simpleType>
> >     </xsd:attribute>
> >   </xsd:complexType>
> >
> > </xsd:schema>